Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] ✨ Add clusterctl changes for cross-ns CC ref #11395

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Danil-Grigorev
Copy link
Member

@Danil-Grigorev Danil-Grigorev commented Nov 11, 2024

What this PR does / why we need it:
This PR allows clusterctl to create a template with a reference to CC located in a different namespace.
This change allows to perform e2e tests in a multi-ns environment.

Added e2e tests to verify cluster-class quick-start scenario and runtime extension scenario in a cross-namespaced context.

The PR is built on top of:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #5673

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area PR is missing an area label labels Nov 11, 2024
@k8s-ci-robot
Copy link
Contributor

This PR is currently missing an area label, which is used to identify the modified component when generating release notes.

Area labels can be added by org members by writing /area ${COMPONENT} in a comment

Please see the labels list for possible areas.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot requested a review from elmiko November 11, 2024 10:19
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 11, 2024
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch 5 times, most recently from d92a5c2 to 7560e66 Compare November 12, 2024 11:33
@Danil-Grigorev Danil-Grigorev changed the title [WIP] ✨ E2E test for the classNamespace usage [WIP] ✨ Add clusterctl changes for cross-ns CC ref Nov 12, 2024
@Danil-Grigorev Danil-Grigorev changed the title [WIP] ✨ Add clusterctl changes for cross-ns CC ref ✨ Add clusterctl changes for cross-ns CC ref Nov 12, 2024
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Nov 12, 2024
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from 7560e66 to 09ce978 Compare November 18, 2024 19:46
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 18, 2024
@fabriziopandini fabriziopandini changed the title ✨ Add clusterctl changes for cross-ns CC ref [WIP] ✨ Add clusterctl changes for cross-ns CC ref Nov 25, 2024
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 25, 2024
@fabriziopandini
Copy link
Member

Renamed wip since it builds on a PR not yet merged

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made first pass, I will get back to this when the PR it builds on top are merged

name: '${CLUSTER_NAME}'
spec:
topology:
classNamespace: '${CLUSTER_CLASS_NAMESPACE}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we can avoid creating a new template by changing cluster-with-topology.yaml so it uses empty string as a default value in case CLUSTER_CLASS_NAMESPACE is not set

Comment on lines 42 to 43
// This test is meant to provide a first, fast signal to detect regression; it is recommended to use it as a PR blocker test.
// NOTE: This test works with Clusters with and without ClusterClass.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering we are trying to keep PR-blocking duration as short as possible, and that the code paths touched by this change are limited and very specific to CC management, would it be possible to add this test on a PR only on demand + on periodic (once it is added to a PR, then it became blocking)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently using this as a way to quickly test subset of scenarios without having to invoke full-test-suite. I’ll unmark it from pre-submits. Should all added test cases be on demand + perodic?

Comment on lines 66 to 67
namespace, cancelWatches = framework.SetupSpecNamespace(ctx, specName, input.BootstrapClusterProxy, input.ArtifactFolder, input.PostNamespaceCreated)
clusterNamespace, cancelWatchesCluster = framework.SetupSpecNamespace(ctx, specName, input.BootstrapClusterProxy, input.ArtifactFolder, input.PostNamespaceCreated)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about having the cluster in the spec namespace (as we usually do) and create a second namespace for the CC with a name equal to the first namespace + a "cluster-class" suffix; also we probably don't need watches in the CC namespace.

This way when we look at the test, most of the triage work will be consistent with what we usually do, and in case we need to look at the CC, it will be easy to find as it is because it is a namespace with the same root name as everything else

@@ -44,7 +46,7 @@ func addClusterClassIfMissing(ctx context.Context, template Template, clusterCla
return template, nil
}

clusterClassesTemplate, err := fetchMissingClusterClassTemplates(ctx, clusterClassClient, clusterClient, classes, targetNamespace, listVariablesOnly)
clusterClassesTemplate, err := fetchMissingClusterClassTemplates(ctx, clusterClassClient, clusterClient, classes, listVariablesOnly)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check if I'm getting this right.
As of today, when we are deploying objects, clusterctl allows to override the namespace set into templates (use case: the user wants to deploy to namespace xyz instead of namespace original).

With this PR we are changing this behaviour, and always preserving the namespace from the template when we are adding the cluster class to the template being deployed (cc preserve: namespace original).

If this is correct, and if I'm not wrong, this change might break users that are deploying a Cluster with ClassNamespace empty, and are expecting that also the CC gets deployed in the target namespace xyz along side with the cluster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn’t break these users, since regular process of overriding the namespace of the templated resources is still performed. This new field was never a part of this logic, and the SoT here is the template initially.

This change avoids problems in discovering the existence of the CC in the specified namespace, allowing to template only a Cluster resource.

There is no support in clusterclt to distinguish that some of the resources should go into A namespace, and some other into B, so this multi-namespace setup is intended to happen in a multiple passes.

  1. ClusterClass and referenced templates
  2. A cluster resource in a different namespace
  3. repeat 2 for any namespace without failures on clusterctl side, or any existing template modifications.

Copy link
Member

@sbueringer sbueringer Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure, but I think this shouldn't break anything.

The namespace-thing for cluster-templates is only envsubst, right?

(while for provider installations it's a lot more)

EDIT: I'll take that one back

Copy link
Member

@sbueringer sbueringer Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My best guess is that it still works as good as before :)

clusterInitialized: false,
objs: []client.Object{},
targetNamespace: "ns1",
clusterClassNamespace: "ns2",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a similar test when both namespaces are the same, and one where both are empty?

@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from 09ce978 to 2d4a64e Compare December 4, 2024 14:34
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 4, 2024
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from 2d4a64e to c181e3e Compare December 4, 2024 15:22
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 4, 2024
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from c181e3e to e155375 Compare December 5, 2024 12:18
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 5, 2024
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2024
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from e155375 to 9b2cce5 Compare January 7, 2025 19:57
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sbueringer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 7, 2025
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from 9b2cce5 to b2cbee3 Compare January 14, 2025 17:37
Copy link
Member

@sbueringer sbueringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial round of review.

But let's wait a bit with adjustments. I have to talk to @fabriziopandini a bit about the test coverage we want / need

namespace *corev1.Namespace
cancelWatches context.CancelFunc
input ClusterUpgradeWithRuntimeSDKSpecInput
namespace, infraNamespace *corev1.Namespace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's also call this clusterClassNamespace
(and cancelClusterClassNamespaceWatches below)

@@ -95,6 +95,9 @@ type ClusterUpgradeWithRuntimeSDKSpecInput struct {
ExtensionServiceNamespace string
// ExtensionServiceName is the name of the service to configure in the test-namespace scoped ExtensionConfig.
ExtensionServiceName string

// ClassNamespace is an optional class namespace reference, configuring cross-namespace cluster class reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the godoc sounds a bit strange and also the field name :)

Is this more like: DeployClusterClassInSeparateNamespace ?

Hm just noticed that it is a *string in the quickstart test. Do we care about the specific namespace used or is a boolean flag enough?
(would be good if both tests have ~ the same input parameter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with bool, but then realized that the NS name generation has a seed, specName in this case. And these names may overlap, so I left the value being specified on demand. May be also clear when collecting logs, where the resources are located (not just 2 randomly generated names as it would otherwise be)

variables["CLUSTER_CLASS_NAMESPACE"] = *input.ClassNamespace
}

By("Creating a cluster referencing a clusterClass from another namespace")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not always true

- Add documentation on securing cross-namespace access for CC
- Add ByClusterClassRef index
- Support cross-ns CC rebase

Signed-off-by: Danil-Grigorev <[email protected]>
Signed-off-by: Danil-Grigorev <[email protected]>
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch 2 times, most recently from 3084938 to 9b17207 Compare January 15, 2025 17:09
@Danil-Grigorev Danil-Grigorev force-pushed the standardize-references-full branch from 9b17207 to 0ee7554 Compare January 15, 2025 18:05
@k8s-ci-robot
Copy link
Contributor

@Danil-Grigorev: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-e2e-blocking-main 0ee7554 link true /test pull-cluster-api-e2e-blocking-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area PR is missing an area label do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow using a ClusterClass across namespaces
4 participants