Event loop blocked inside vertx-pg-client during DNS resolution #1429
Comments
When it happens, can you make a thread dump? How long does it block?
As reported: 2806 ms. This is on a production instance, so no, I don't think I can make a thread dump. We run our production containers pretty lean, so it may have been CPU throttled during startup:
Does it always happen with the same stack trace?
We've only just started tracking blocked threads, for some reason we only logged vertx
Thanks for the rapid response 😄
Just adding additional stack traces as they occur...
For all of these additional stack traces, the pod had started 2 minutes prior, so this would be one of the first DNS lookups when connecting to the DB host.
Thanks for your reports, @sishbi. A few questions:
I read in the bug report that you use K8S. Perhaps the combined cost of
It only seems to report this just after container start. So I would say it reaches a steady state.
I am not sure what you mean by testing without a security manager. This is a deployed application bundled in a jar, running on JDK 21.
I would guess that to be the case. Thanks again for the help and advice.
After another deployment, I see a blocked thread again - it was reported twice for the same thread, here are the 2 reports:
It is not clear why this is happening. What activity is being done with the PG pool when this happens?
Questions
Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2806 ms, time limit is 2000 ms
Version
4.5.4
Context
I think this only occurs during startup, the first time the service needs to look up the IP address for the database host.
Our service is running within an AWS k8s cluster and uses AWS PostgreSQL RDS.
Do you have a reproducer?
I don't know if the issue is related to class loading or with the DNS request.
We have configured the blocked-thread warning to include the stack trace at 2 seconds, the same threshold as the warning itself. We are just starting to look at these warnings and found that reports without a stack are harder to diagnose: they only list the thread, and since it is just a generic 'vertx-thread-1' we can't find out exactly what was happening at the time.
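To make the two thresholds concrete, here is a rough stdlib-only sketch of how I understand the check to behave. It is not Vert.x's actual code: the class, enum, and method names are hypothetical, and the rule (warn at the time limit, attach the stack once the stack-trace threshold is also reached) is my assumption based on the behaviour described above. The message format follows the warning quoted in this issue.

```java
import java.util.concurrent.TimeUnit;

/** Stdlib-only sketch; class and method names are hypothetical, not Vert.x API. */
final class BlockedThreadReportSketch {

    /** What a checker would log for one sampled thread. */
    enum Report { NONE, WARNING_ONLY, WARNING_WITH_STACK }

    /**
     * Assumed rule: warn once the time limit is reached, and attach the
     * stack trace once the separate stack-trace threshold is also reached.
     */
    static Report classify(long blockedMs, long timeLimitMs, long stackTraceThresholdMs) {
        if (blockedMs < timeLimitMs) {
            return Report.NONE;
        }
        return blockedMs >= stackTraceThresholdMs
            ? Report.WARNING_WITH_STACK
            : Report.WARNING_ONLY;
    }

    /** Formats the warning in the shape quoted in this issue. */
    static String message(String thread, long blockedMs, long timeLimitMs) {
        return "Thread " + thread + " has been blocked for " + blockedMs
            + " ms, time limit is " + timeLimitMs + " ms";
    }

    public static void main(String[] args) {
        long limitMs = TimeUnit.SECONDS.toMillis(2);

        // Both thresholds at 2 s, as in our setup: every warning carries a stack.
        System.out.println(classify(2806, limitMs, limitMs));

        // Hypothetical 5 s stack-trace threshold: same block, warning only.
        System.out.println(classify(2806, limitMs, 5000));

        System.out.println(message("Thread[vert.x-eventloop-thread-1,5,main]", 2806, limitMs));
    }
}
```

With both thresholds equal, there is never a warning without a stack, which is the trade-off we chose while we are still learning what these reports mean.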
But is the blocked thread report affected by always including the stack trace? If we didn't include the stack would the ClassLoader be included in the reported stack? Or should we read past that point and look at the first non-class-loader class?
So, in this case:
io.netty.handler.codec.dns.DatagramDnsQuery.<init>
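For what it's worth, this is roughly how I would "read past" the class-loader frames programmatically. It is only a sketch: the class name, the matching heuristic, and the sample stack are all hypothetical (the sample is illustrative, not the real trace from our logs), and the only frame taken from this report is `DatagramDnsQuery.<init>`.

```java
import java.util.Arrays;
import java.util.Optional;

/** Stdlib-only sketch; the class name and the heuristic are hypothetical. */
final class FirstNonClassLoaderFrameSketch {

    /** Heuristic (an assumption): treat these classes as class-loading machinery. */
    static boolean isClassLoading(StackTraceElement frame) {
        String cls = frame.getClassName();
        return cls.startsWith("jdk.internal.loader.")
            || cls.endsWith("ClassLoader")
            || cls.contains("ClassLoader$");
    }

    /** Returns the topmost frame that is not class-loading machinery, if any. */
    static Optional<StackTraceElement> firstNonClassLoaderFrame(StackTraceElement[] stack) {
        return Arrays.stream(stack)
            .filter(frame -> !isClassLoading(frame))
            .findFirst();
    }

    public static void main(String[] args) {
        // Illustrative stack, top frame first; file names and line numbers unknown.
        StackTraceElement[] stack = {
            new StackTraceElement("jdk.internal.loader.BuiltinClassLoader", "loadClass", null, -1),
            new StackTraceElement("java.lang.ClassLoader", "loadClass", null, -1),
            new StackTraceElement("io.netty.handler.codec.dns.DatagramDnsQuery", "<init>", null, -1),
            new StackTraceElement("com.example.HypotheticalCaller", "connect", null, -1),
        };

        firstNonClassLoaderFrame(stack).ifPresent(frame ->
            System.out.println(frame.getClassName() + "." + frame.getMethodName()));
    }
}
```

Whether the time is really spent in class loading or in the first caller below it is exactly the open question here, so this only tells us which frame triggered the load.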
Steps to reproduce
Extra
OS: AWS Linux (Docker container), x64 host
JVM: JDK 21, eclipse-temurin:21-jdk (Docker)