docs(release): add note to 1.22 about the CA CN rename #15324
base: master
- this broke some of my teams' automation code that relied on the CN, so thought it would be good to call out in case anyone else stumbles upon this in order to not spend a few hours debugging
- this change may particularly impact Prod environments, where the cluster has not been rebuilt in some time (years), and so they will have the old CN while new clusters in lower environments will have the new CN
- so code that relies on the CN may unexpectedly break Production while working fine in lower environments
- fortunately we caught this in our QA env, but it passed Dev fine
@johngmyers any chance you could take a look at this PR?
I must admit to being at a loss as to why anything would depend on the CN having a particular value.
The automation I mentioned here was for the Vault k8s integration, which requires the k8s CA. Can see in the screenshot above that this code does TLS inspection to pull the CA, but ends up getting two certs, the CA and the I don't think that's the best way to get the CA, as I mentioned in the opening, but regardless, a team I work with already had code that specifically depended on the CN.

I.e. the CN was part of the "public surface" that kOps exposes, and changing the CN broke existing code. Whether or not that code is optimal or why that code exists, the fact that any code exists anywhere that relies on the CN is enough to show this. As is oft stated in Linux kernel development, anything that is exposed will be used (aka the "Workflow" xkcd).

So at the very least, I think this should be documented in the changelog, which is all that this PR does.
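To make the fragility concrete, here is a minimal sketch of picking the CA out of the two inspected certificates by structure (subject equals issuer) instead of by CN. It is not the Vault integration code. It assumes the certificates have already been decoded into the dict form that Python's `ssl` module returns from `getpeercert()`, and that the cluster CA is a self-signed root while the other certificate is not. The function names and the CN strings in the usage note are hypothetical placeholders.

```python
from typing import Optional, Sequence


def common_name(name: Sequence) -> Optional[str]:
    """Return the commonName from an ssl-style name, or None if absent.

    `name` is a tuple of RDNs, each RDN being a tuple of (key, value) pairs.
    """
    for rdn in name:
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def find_self_signed(certs: Sequence[dict]) -> dict:
    """Return the single certificate whose subject equals its issuer.

    Assumption: the cluster CA is a self-signed root, so this check
    identifies it no matter which CN the CA happens to carry.
    """
    matches = [
        cert
        for cert in certs
        if cert.get("subject") and cert.get("subject") == cert.get("issuer")
    ]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one self-signed cert, found {len(matches)}")
    return matches[0]
```

For example, `find_self_signed([leaf, ca])` returns `ca` whether its CN is a placeholder such as `"example-old-ca"` or `"example-new-ca"`, whereas a filter like `common_name(cert["subject"]) == "example-old-ca"` silently stops matching on clusters created after the rename.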
@johngmyers can you take a look again?
I don't think this rises to the level of a release note. This is an implementation detail; the CN of a root CA is not something that is supposed to matter to anything.
It's a breaking change that broke actual code in production exactly because it wasn't mentioned in the release notes... A reasonable user could suggest much more than a tiny release note when a change broke prod. This is the bare minimum, in my opinion.
It did. This is not a hypothetical...
kOps 1.22 has been out of support for about a year. One person making an incorrect assumption about internals does not merit a release note.
I'll repeat once more that this breaking change affects clusters that upgraded to 1.22 and beyond and were not rebuilt in that time. Which is, again, much more likely for Prod clusters that have zero downtime (i.e. high criticality). A cluster that was upgraded from 1.21 to 1.26, which is current, would also have this issue. As written above, that cluster, which would be on 1.26, would have the old CN, while clusters built with 1.22+, and similarly on 1.26, would have the newer CN. Two clusters on 1.26 could have two different CNs, despite having the same configuration. The CA has a different CN depending on what version of kOps the cluster was originally created with (not the version it's on).
I'm not the one who made that assumption (and it was, in fact, code reviewed by a whole team that has used kOps for years), and I, myself, literally said there are better ways of implementing their code. It is also a CA; those are not really meant to change frequently either. Whether a CA should be considered an "internal" is debatable.
One could say that this, too, is an assumption. And there is an existing counter-example that shows it did, in fact, matter to something... So one could further say that that is "an incorrect assumption" 🤷 I'm not here to play word games and state opinions, I am stating facts about events that literally already happened.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
Summary
Add a note in 1.22 changelog about #11921 et al's CA changes
Details
Testing Evidence
From an internal fix my team made to another team's automation:
(We've recommended they pull the CA from /etc/kubernetes/pki/ca.crt or /var/run/secrets/kubernetes.io/serviceaccount/ca.crt to be a bit more resilient to changes like this)
Review Notes
Feel free to change the language / wording as you see fit or put in a different part of the release notes (I put this under "Other changes of note" right now).
I thought specifically warning about older clusters was important, as otherwise this is particularly easy to overlook.
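The recommendation under Testing Evidence, reading the CA from a well-known file rather than from TLS inspection, could look roughly like the sketch below. The two paths are the ones named in the description; which one exists depends on where the code runs (on a node or inside a pod). The function name and the PEM sanity check are illustrative assumptions, not part of kOps or Vault.

```python
from pathlib import Path
from typing import Iterable

# Paths taken from the PR description; availability depends on the environment.
DEFAULT_CA_PATHS = (
    "/etc/kubernetes/pki/ca.crt",
    "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
)

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def load_cluster_ca(paths: Iterable[str] = DEFAULT_CA_PATHS) -> str:
    """Return the PEM text of the first candidate file that holds a certificate.

    No CN is inspected, so a CA rename does not affect this lookup.
    """
    tried = []
    for candidate in paths:
        tried.append(candidate)
        path = Path(candidate)
        if not path.is_file():
            continue
        pem = path.read_text()
        # Light sanity check only; this does not validate the certificate.
        if PEM_HEADER in pem:
            return pem
    raise FileNotFoundError(f"no CA certificate found in: {', '.join(tried)}")
```

Because the lookup is keyed on location rather than on the certificate's subject, it behaves the same on clusters created before and after 1.22.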