-
Notifications
You must be signed in to change notification settings - Fork 26.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] The client receives an application offline change notification, but channel.close is not triggered 100% of the time #14938
Comments
Normally it will be also notifyed by RegistryDirectory, because it is dual registration(interface & instance). when all of them finished destory, the count of ReferenceCountExchangeClient will be zero, so it will be destory. |
@laywin That is now the Destory of the ServiceDisCoveryRegistrydirectory, but the Destroyunovokers triggered by the registryDirectory did not print the destruction log. That is, nowadays, all INVOKEER destroyed. ReferenceCountexchangeClient's counter is not 0, and it will not netty client close. How should I check why it is not ruined? Intersection Normally, ServiceDiscoveryRegistryDirectory and RegistryDirectory will be triggered? Or is there any ready -made case for reference? |
when provider shutdown, in consumer side you can watch the network connect (netstat -apn | grep 20880) |
@laywin It is difficult to recurrent this problem, and provider may only appear once 10 times. Under normal circumstances, the "destroy invoker" of ServicesCoveryRegiskRyDirectory and RegistryDirectory will trigger. Sometimes the two are missing one before, and then the "Netty Channl" cannot be closed. Do you have any good suggestions for reappearance? |
@TerryLam2010 Can you test on the latest Dubbo 3.2 version. We have fixed several concurrency issues in 3.2 |
@AlbumenJ I am using Dubbo 3.2.3 in the test environment. The issue is that the instance-level offline notification for the ServiceDiscoveryRegistryDirectory has come, but the interface-level notification for RegistryDirectory has not, resulting in the counter of ReferenceCountExchangeClient not being zero.Could you tell me how the refreshInvoker in RegistryDirectory can be triggered? Is there a sequence of events between ServiceDiscoveryRegistryDirectory and RegistryDirectory? |
We have fixed couple of bugs about connections in 3.2.x and it is hard to figure out which. :) |
Pre-check
Search before asking
Apache Dubbo Component
Java SDK (apache/dubbo)
Dubbo Version
Dubbo Java: 3.2.3
JDK8
Use K8s
Steps to reproduce this issue
Provider and Consumer config:
<dubbo:application name="${spring.application.name}" logger="slf4j" register-mode="all" metadata-type="remote" >
<dubbo:parameter key="qos.enable" value="false" />
</dubbo:application>
other config property:
dubbo.registry.parameters.register-consumer-url : true
dubbo.application.serialize-check-status : DISABLE
dubbo.application.enable-empty-protection : true
dubbo.provider.prefer-serialization : hessian2
dubbo.provider.serialization : hessian2
I'm from dubbo 2.7.8 upgrade to 3.2.3, found that when the provider referrals, consumers occasionally trigger org.apache.dubbo.remoting.exchange.support.header.ReconnectTimerTask rewiring (only once), The provider itself closes the channel. But I've looked through the code, and if nacos is notified to go offline, the client will destroy the old DubboInvoker, triggering the channel shutdown and leaving the ReconnectTimerTask empty.
My situation now is that I see the log org.apache.dubbo.registry.client.ServiceDiscoveryRegistryDirectory class has come in, also found that destroy the invoker corresponding log, However, the Close netty channel log did not find that although the application offline notification came, it did not close the channel.
Because in the process of gradual launch, the logs I give you are all test environments, all in the 3.2.3 version. The production environment is basically 2.7.8, and only one application is 3.2.3
——————————————————————————————————————————————————
我是从dubbo 2.7.8升级到3.2.3,发现当提供者下线的时候,消费者偶尔触发org.apache.dubbo.remoting.exchange.support.header.ReconnectTimerTask的重连(仅一次),是因为provider自己关闭channel了。但我浏览过代码,如果通知了nacos下线,client会将旧的DubboInvoker销毁,从而触发channel关闭以及将ReconnectTimerTask的任务置空。
现在我的情况是,我看日志org.apache.dubbo.registry.client.ServiceDiscoveryRegistryDirectory类有进来,也发现destroy invoker对应的日志,但是Close netty channel 的日志并没有发现,那就是虽然应用下线通知过来了,但是并没有关闭channel。
What you expected to happen
I think org.apache.dubbo.rpc.protocol.dubbo.ReferenceCountExchangeClientcounter problem, lead to can not close the channel. Because my local online and offline is normal, only in the test environment and online environment is repeated, just look at the log can not see the problem
Anything else
2024-11-22T14:53:36,348+0800 [DUBBO] destroy invoker[DefaultServiceInstance{serviceName='order', host='172.29.89.45', port=11060, enabled=true, healthy=true, metadata={dubbo.endpoints=[{"port":11060,"protocol":"dubbo"}], dubbo.metadata.revision=e9aa0faa8181f81a787a5ce1362ce1c7, dubbo.metadata.storage-type=remote, timestamp=1732256698869}}, service{name='com.xxx.xxx.client.rpc.OrderRuleCheckRpcService',group='uat',version='null',protocol='dubbo',port='11060',params={side=provider, heartbeat=30000, release=3.2.3, methods=check,checkAndThrowEx, logger=slf4j, deprecated=false, dubbo=2.0.2, threads=20, interface=com.xxxx.xxxx.client.rpc.OrderRuleCheckRpcService, service-name-mapping=true, threadpool=cached, timeout=1000, generic=false, revision=1.0.0, serialize.check.status=DISABLE, serialization=hessian2, retries=0, metadata-type=remote, application=order, prefer.serialization=hessian2, dynamic=true, enable-empty-protection=true, group=uat},}] success. , dubbo version: 3.2.3, current host: 172.29.119.195
2024-11-22T14:53:36,348+0800 [DUBBO] 1 deprecated invokers deleted., dubbo version: 3.2.3, current host: 172.29.119.195
2024-11-22T14:53:36,349+0800 [DUBBO] serviceKey:uat/com.xxxx.xxxx.client.rpc.OrderRuleCheckRpcService Instance address size 2, interface address size 3, threshold 0.0, dubbo version: 3.2.3, current host: 172.29.119.195
2024-11-22T14:53:36,349+0800 [DUBBO] Received invokers changed event from registry. Registry type: instance. Service Key: uat/com.xxxx.xxxx.client.rpc.OrderRuleCheckRpcService. Urls Size : 2. Invokers Size : 2. Available Size: 2. Available Invokers : 172.29.92.255:11060,172.29.89.9:11060, dubbo version: 3.2.3, current host: 172.29.119.195
It does not make sense that these logs are triggered, but Close Netty channel does not occur.
触发了这些日志,但是并没有发生Close Netty channel, 这并不合理。
Are you willing to submit a pull request to fix on your own?
Code of Conduct
The text was updated successfully, but these errors were encountered: