Policy Server: reduce memory usage #528
Comments
I've started looking into this issue during the last few days. TL;DR: I think we can close this issue because we do not have high memory usage. There are some changes to our architecture that we could make, but these would reduce the requests per second that can be processed by a single instance of Policy Server.

The environment

I've run the tests against Kubewarden 1.7.0-rc3. Everything has been installed from our Helm charts, and I've enabled the recommended policies. It's important to highlight that I've not loaded any of the new "vanilla WASI" policies introduced by 1.7.0: the numbers shown inside the memory reduction PR were skewed by the usage of the go-wasi-template policy. This is a policy built with the official Go compiler; it's huge (~20 MB) and provides a distorted picture of our memory usage, see this comment I've just added on the closed PR.

To recap, I've been testing a Policy Server hosting 10 policies:
Another possible activity for the future: rewrite the load-tests to make use of k6. Currently the load-tests are written with Locust, which is doing a fine job. However, if we were to migrate to k6 we could then have a performance testing environment made of:
The most interesting - and appealing - advantage of this solution is the ability to correlate the load generated by the k6 suite with the resource usage. Right now this analysis requires quite some manual work on my side. It would be great to have something that produces better and more consistent results. IMHO this should be the first thing we address.
I agree with @flavio's arguments, and we can leave these enhancements for the future. Maybe we can keep the performance-test improvements in the queue and postpone the other changes. Then, if we find any issue, we can bring those improvements back to the table.
This is incredible work, thanks for the insights and the learning pointers. From what is presented, I agree on the priorities, and I would also like to tackle a dynamic worker pool and to try
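As a purely illustrative aside, below is a minimal sketch of what a dynamic worker pool could look like in plain Rust with `std` only. The `DynamicPool` type, its `submit` method and the `max_workers` cap are hypothetical names invented for this example; a real Policy Server worker would also carry its Wasm evaluation state, so this is a sketch of the general idea, not of the actual implementation.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical dynamic pool: jobs go through a shared channel and worker
/// threads are spawned lazily, up to `max_workers`, as jobs are submitted.
struct DynamicPool {
    sender: mpsc::Sender<Job>,
    receiver: Arc<Mutex<mpsc::Receiver<Job>>>,
    spawned: usize,
    max_workers: usize,
}

impl DynamicPool {
    fn new(max_workers: usize) -> Self {
        let (sender, receiver) = mpsc::channel();
        DynamicPool {
            sender,
            receiver: Arc::new(Mutex::new(receiver)),
            spawned: 0,
            max_workers,
        }
    }

    fn submit(&mut self, job: impl FnOnce() + Send + 'static) {
        // Grow the pool only while we are below the cap; otherwise the job
        // simply waits in the channel until an existing worker is free.
        if self.spawned < self.max_workers {
            self.spawned += 1;
            let receiver = Arc::clone(&self.receiver);
            thread::spawn(move || loop {
                // The lock is held only while waiting for the next job.
                let job = receiver.lock().unwrap().recv();
                match job {
                    Ok(job) => job(),
                    Err(_) => break, // all senders dropped: shut the worker down
                }
            });
        }
        self.sender.send(Box::new(job)).expect("pool is alive");
    }
}

fn main() {
    let mut pool = DynamicPool::new(4);
    for i in 0..8 {
        pool.submit(move || println!("evaluating admission request {i}"));
    }
    // Crude way to let the detached workers drain the queue in this sketch.
    thread::sleep(Duration::from_millis(200));
}
```

The point of the sketch is only that workers are spawned on demand up to a cap, instead of eagerly allocating a fixed number of them at startup.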
We are using Kubewarden in some of our staging clusters, and with 61 cluster-wide policies deployed the policy-server is using a lot of memory (40-50 GB). Any suggestions or optimizations I could apply on my side?
Can you provide some details about your usage pattern? This could help us fine-tune the optimization ideas we have. Some questions:
@ish-xyz I'm currently working on a fix for this issue. Can you please provide more information about your environment (see the previous comment)?
Reduce the memory consumption of Policy Server when multiple instances of the same Wasm module are loaded. Thanks to this change, a worker will have only one instance of `PolicyEvaluator` (and hence of the wasmtime stack) per unique module. Practically speaking, if a user has the `apparmor` policy deployed 5 times (different names, settings,...), only one instance of `PolicyEvaluator` will be allocated for it. Note: the optimization works at the worker level, meaning that workers do NOT share these instances between themselves. This commit helps to address issue kubewarden/kubewarden-controller#528

Signed-off-by: Flavio Castelli <[email protected]>
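To make the idea in the commit message concrete, here is a minimal sketch of a per-worker cache keyed by the module digest. The `Worker` struct, the `evaluator_for` helper and the stub `PolicyEvaluator` are simplified stand-ins invented for this illustration, not the actual policy-server / policy-evaluator API:

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Stub stand-in for the real `PolicyEvaluator`, which owns the wasmtime stack.
struct PolicyEvaluator {
    module_digest: String,
}

impl PolicyEvaluator {
    fn new(module_digest: &str) -> Self {
        // The real implementation would compile/instantiate the Wasm module here.
        PolicyEvaluator {
            module_digest: module_digest.to_owned(),
        }
    }
}

/// Per-worker cache: one evaluator per unique Wasm module, shared by every
/// policy (regardless of its name or settings) that points at that module.
#[derive(Default)]
struct Worker {
    evaluators: HashMap<String, Rc<PolicyEvaluator>>,
}

impl Worker {
    fn evaluator_for(&mut self, module_digest: &str) -> Rc<PolicyEvaluator> {
        Rc::clone(
            self.evaluators
                .entry(module_digest.to_owned())
                .or_insert_with(|| Rc::new(PolicyEvaluator::new(module_digest))),
        )
    }
}

fn main() {
    let mut worker = Worker::default();
    // Five policies backed by the same module resolve to a single evaluator.
    let evaluators: Vec<_> = (0..5)
        .map(|_| worker.evaluator_for("sha256:apparmor-module"))
        .collect();
    assert!(Rc::ptr_eq(&evaluators[0], &evaluators[4]));
    println!(
        "{} evaluator(s) allocated for module {}",
        worker.evaluators.len(),
        evaluators[0].module_digest
    );
}
```

Under this scheme, five `apparmor` policies with different names and settings share one cached evaluator within a worker, while separate workers still keep their own caches.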
JFYI: there's a PR under review that reduces the amount of memory consumed: kubewarden/policy-server#596
@flavio
@ish-xyz we've finished a major refactor of the policy-server codebase (see kubewarden/policy-server#596 (comment)). At the beginning of next year (January 2024) we are going to release a new version of Kubewarden. In the meantime, it would be great if you could give the changes a quick test by consuming the
@ish-xyz Flavio pointed me at this issue; we have another support case with this problem. When installed, Kubewarden spams ReplicaSets and floods the apiserver, eating up memory. He suggested I collect info from the customer and add it here:

• How did you get this metric, is that via prometheus/cAdvisor?

If you are referring to the number of ReplicaSets, it is the one reported in Rancher's UI.

NAME POLICY SERVER MUTATING BACKGROUNDAUDIT MODE OBSERVED MODE STATUS AGE

All of the policies have been provided by Rancher; the kubewarden-mandatorynamespacelabels-policy is not using any regex, it just checks that a projectId label is present, to prevent namespaces from being created outside projects.
• Is the memory consumption increasing over time? I wonder if there's any policy that is leaking memory.

No, we have not seen the memory consumption increasing over time in any of the Kubewarden pods. For instance, every Policy Server pod is consuming a flat 700M. What happened was related to the apiserver in the control plane. I will attach some screenshots of the CPU, memory and disk utilization around the first time it happened.
To clarify the last comment from @vincebrannon: it seems that the culprit was misconfigured mutating policies fighting with a k8s controller, which caused ReplicaSets to be continuously created. This has been fixed for the default policies (which now only target Pods) in Kubewarden 1.10.0.
For general memory consumption, the proposed changes (and more) are now included in the newly released Kubewarden 1.10.0. This release reduces the memory footprint and, more importantly, memory consumption is now constant regardless of the number of worker threads or of horizontal scaling. You can read more about it in the 1.10.0 release blog post. Closing this card then. Please don't hesitate to reopen, comment here, or open another card if this becomes an issue again!
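For intuition only, here is a rough sketch of why memory can stay roughly constant as worker threads are added, assuming precompiled policy modules are shared behind `Arc` handles instead of being duplicated per worker. The `CompiledPolicy` type and the digests below are hypothetical placeholders, not the actual policy-server implementation, and the sketch does not cover the horizontal-scaling side (which is about the per-replica footprint staying flat):

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for a precompiled Wasm module / evaluation environment.
struct CompiledPolicy {
    digest: String,
}

fn main() {
    // Compile every unique module exactly once...
    let shared: Arc<HashMap<String, Arc<CompiledPolicy>>> = Arc::new(
        ["sha256:apparmor", "sha256:safe-labels"]
            .iter()
            .map(|d| {
                (
                    d.to_string(),
                    Arc::new(CompiledPolicy {
                        digest: d.to_string(),
                    }),
                )
            })
            .collect(),
    );

    // ...then hand each worker a cheap `Arc` clone of the shared map: adding
    // workers (or threads) does not duplicate the compiled modules, so the
    // memory footprint stays roughly constant as the pool grows.
    let handles: Vec<_> = (0..4)
        .map(|worker_id| {
            let policies = Arc::clone(&shared);
            thread::spawn(move || {
                for policy in policies.values() {
                    println!("worker {worker_id} can evaluate {}", policy.digest);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```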
While setting up Kubewarden at a SUSECon demo booth, we noticed that policy-server was consuming more memory than the other workloads defined inside the cluster.
This led to some quick investigation that resulted in this fix. However, we decided to schedule more time to look into other improvements.