yarn test people runs into RPC response too big #161

Closed
niklasad1 opened this issue Jan 5, 2025 · 9 comments · Fixed by #162

Comments

@niklasad1

niklasad1 commented Jan 5, 2025

Hey hey,

I looked into what caused polkadot-fellows/runtimes#521 to fail. It came down to two different "response too big" errors:

  1. state_getStorage which could exceed 10MB
  2. state_getRuntimeVersion which shouldn't be as big as 10MB

The dump with RPC responses can be found below:

stdout | packages/kusama/src/people.kusama.e2e.test.ts > Kusama People > adding a registrar as root from the relay chain works
2025-01-05 14:23:48          API-WS: connected to ws://[::]:42597
2025-01-05 14:23:48          API-WS: calling chain_getBlockHash {"id":1,"jsonrpc":"2.0","method":"chain_getBlockHash","params":[0]}
2025-01-05 14:23:48          API-WS: calling state_getRuntimeVersion {"id":2,"jsonrpc":"2.0","method":"state_getRuntimeVersion","params":[]}
2025-01-05 14:23:48          API-WS: calling system_chain {"id":3,"jsonrpc":"2.0","method":"system_chain","params":[]}
2025-01-05 14:23:48          API-WS: calling system_properties {"id":4,"jsonrpc":"2.0","method":"system_properties","params":[]}
2025-01-05 14:23:48          API-WS: calling rpc_methods {"id":5,"jsonrpc":"2.0","method":"rpc_methods","params":[]}
2025-01-05 14:23:48          API-WS: calling system_chain {"id":4,"jsonrpc":"2.0","method":"system_chain","params":[]}
2025-01-05 14:23:48          API-WS: calling system_properties {"id":5,"jsonrpc":"2.0","method":"system_properties","params":[]}
2025-01-05 14:23:48          API-WS: calling chain_getBlockHash {"id":6,"jsonrpc":"2.0","method":"chain_getBlockHash","params":[0]}
2025-01-05 14:23:48          API-WS: calling state_getStorage {"id":7,"jsonrpc":"2.0","method":"state_getStorage","params":["0x3a636f6465","0x459607e5b5e693d31df811a71241fddaea975c03a9827d1cae14f54515a71a3a"]}

<--- omitted logs --->

#1, state_getStorage failed which looks reasonable
2025-01-05 14:23:48          API-WS: received {"jsonrpc":"2.0","error":{"code":-32008,"message":"Response is too big","data":"Exceeded max limit of 1048576"},"id":7}

stderr | packages/kusama/src/people.kusama.e2e.test.ts > Kusama People > adding a registrar as root from the relay chain works
2025-01-05 14:23:48        RPC-CORE: getRuntimeVersion(at?: BlockHash): RuntimeVersion:: -32603: Internal RpcError: -32008: Response is too big: Exceeded max limit of 1048576
2025-01-05 14:23:48        API/INIT: Error: FATAL: Unable to initialize the API: -32603: Internal RpcError: -32008: Response is too big: Exceeded max limit of 1048576
    at ApiPromise.__internal__onProviderConnect (file:///home/niklasad1/Github/polkadot-ecosystem-tests/node_modules/@polkadot/api/base/Init.js:383:27)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

#2, `state_getRuntimeVersion` which shouldn't hit max limit of 10MB
stdout | packages/kusama/src/people.kusama.e2e.test.ts > Kusama People > adding a registrar as root from the relay chain works
2025-01-05 14:23:48          API-WS: received {"id":1,"jsonrpc":"2.0","result":"0xb0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe"}
2025-01-05 14:23:48          API-WS: received {"id":2,"jsonrpc":"2.0","error":{"code":-32603,"message":"Internal RpcError: -32008: Response is too big: Exceeded max limit of 1048576"}}

Thus, the first error comes from jsonrpsee and looks legitimate. The second one doesn't come directly from jsonrpsee, because jsonrpsee doesn't use that format for internal errors, and it doesn't look correct. My guess is that the second one comes from subway here, which wraps the error message in an internal error.
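
For readers who want to tell the two error shapes in the dump apart programmatically, here is a minimal sketch. It only inspects the JSON-RPC error objects exactly as they appear in the logs above. The `classifyTooBig` helper and the `Origin` labels are hypothetical names made up for illustration; they are not part of jsonrpsee, subway, or polkadot-js.

```typescript
// JSON-RPC error object, shaped like the ones in the logs above.
interface RpcError {
  code: number;
  message: string;
  data?: unknown;
}

// Hypothetical labels for this sketch only.
type Origin = "direct" | "wrapped" | "other";

const RESPONSE_TOO_BIG = -32008; // code seen in the logs
const INTERNAL_ERROR = -32603; // JSON-RPC 2.0 "Internal error"

// "direct": the server itself reported the size limit.
// "wrapped": some layer turned that error into an internal error
// and kept the original code only inside the message text.
function classifyTooBig(error: RpcError): Origin {
  if (error.code === RESPONSE_TOO_BIG) return "direct";
  if (
    error.code === INTERNAL_ERROR &&
    error.message.includes(String(RESPONSE_TOO_BIG))
  ) {
    return "wrapped";
  }
  return "other";
}

// The two responses copied from the dump (ids 7 and 2).
const firstResponse = JSON.parse(
  '{"jsonrpc":"2.0","error":{"code":-32008,"message":"Response is too big","data":"Exceeded max limit of 1048576"},"id":7}'
);
const secondResponse = JSON.parse(
  '{"id":2,"jsonrpc":"2.0","error":{"code":-32603,"message":"Internal RpcError: -32008: Response is too big: Exceeded max limit of 1048576"}}'
);

console.log(classifyTooBig(firstResponse.error)); // direct
console.log(classifyTooBig(secondResponse.error)); // wrapped
```

Matching on the message text is fragile and is only meant to show the difference between the two shapes, not to identify which component did the wrapping.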

/cc @xlc @shunsukew Can you confirm? Maybe you terminate the connection and send back the error to all pending rpc calls or something?!

@xlc
Member

xlc commented Jan 5, 2025

Can you find out which storage key is too big to fetch? So that I can tweak the test to avoid it.

The RPC used is

endpoint: 'wss://polkadot-people-rpc.polkadot.io',

[Image: Screenshot 2025-01-06 at 11 08 20 AM]

And I checked: it is not behind Subway.

Another thing that bugs me is that I can't easily reproduce the error locally. I managed to trigger it once but was unable to trigger it on subsequent runs. Can someone check if the endpoint is behind a load balancer?

You can also see that CI runs successfully most of the time, which indicates that it is unlikely to be a code problem in this repo or chopsticks. The only explanation I can think of is that the endpoint is a load balancer with multiple RPC nodes, one of them is misbehaving, and the test fails only if that particular node handles the requests.

@niklasad1
Author

Thanks for the info, lemme debug which storage key failed and ask around about the load balancer settings.

@niklasad1
Author

> Can you find out which storage key is too big to fetch? So that I can tweak the test to avoid it.

It is "0x3a636f6465", also known as ":code", that causes it, and it is between 2MB and 4MB depending on the chain.
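
To make the key and the size argument concrete, here is a small sketch. It decodes the hex key from the log and estimates how large a `state_getStorage` result becomes once the value is returned as a 0x-prefixed hex string, which roughly doubles the byte count. The helper names are made up for this example, the JSON envelope is ignored, and treating "10MB" as 10 * 1024 * 1024 bytes is an assumption; only the 1048576 limit comes from the error message.

```typescript
// Decode a 0x-prefixed hex string into ASCII text.
function hexToAscii(hex: string): string {
  const body = hex.slice(0, 2) === "0x" ? hex.slice(2) : hex;
  if (body.length % 2 !== 0) throw new Error("odd-length hex string");
  let out = "";
  for (let i = 0; i < body.length; i += 2) {
    out += String.fromCharCode(parseInt(body.slice(i, i + 2), 16));
  }
  return out;
}

// Characters needed to return `valueBytes` bytes as a hex string:
// two characters per byte plus the "0x" prefix (JSON envelope ignored).
function hexResponseChars(valueBytes: number): number {
  return 2 * valueBytes + 2;
}

function fitsLimit(valueBytes: number, maxResponseBytes: number): boolean {
  return hexResponseChars(valueBytes) <= maxResponseBytes;
}

const MIB = 1024 * 1024;
const LIMIT_FROM_ERROR = 1048576; // "Exceeded max limit of 1048576"
const ASSUMED_10MB_LIMIT = 10 * MIB; // assumption about what "10MB" means

console.log(hexToAscii("0x3a636f6465")); // :code
console.log(fitsLimit(2 * MIB, LIMIT_FROM_ERROR)); // false
console.log(fitsLimit(4 * MIB, ASSUMED_10MB_LIMIT)); // true
```

Under these assumptions, even the smaller 2MB runtime cannot fit through a 1048576-byte limit, while the larger 4MB one still fits under 10MB.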

> Another thing that bugs me is that I can't easily reproduce the error locally. I managed to trigger it once but was unable to trigger it on subsequent runs. Can someone check if the endpoint is behind a load balancer?

Yes, you are correct. I tried connecting many times and most of the time it actually works as intended. But I was wrong: it's actually rpc-kusama.luckyfriday.io that has at least one node with 1MB as the max message size, which breaks this.

2025-01-06T21:17:42.453733Z DEBUG jsonrpsee-client: Connecting to target: Target { host: "rpc-kusama.luckyfriday.io", host_header: "rpc-kusama.luckyfriday.io", _mode: Tls, path_and_query: "/", basic_auth: None }
2025-01-06T21:17:42.891083Z DEBUG jsonrpsee-client: Connection established to target: Target { host: "rpc-kusama.luckyfriday.io", host_header: "rpc-kusama.luckyfriday.io", _mode: Tls, path_and_query: "/", basic_auth: None }
2025-01-06T21:17:43.032229Z ERROR ws: wss://rpc-kusama.luckyfriday.io error: Call(ErrorObject { code: ServerError(-32008), message: "Response is too big", data: Some(RawValue("Exceeded max limit of 1048576")) })

I can try to contact them, but meanwhile the fix could be to switch to some other provider such as wss://kusama-rpc.dwellir.com
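
The "connect many times and see how often it fails" check described above can be summarised with a small helper. This is only a sketch: it assumes each attempt has already been recorded as either a success or a JSON-RPC error code, and the `ProbeResult` type, the `verdictFor` function, and the sample attempts are all hypothetical. It does no networking.

```typescript
// Outcome of one attempt: connect, then fetch ":code" via state_getStorage.
// `code` is the JSON-RPC error code when the attempt failed.
interface ProbeResult {
  ok: boolean;
  code?: number;
}

type Verdict = "healthy" | "intermittent" | "failing";

const TOO_BIG_CODE = -32008; // code seen in the logs above

// "intermittent" is the pattern one would expect from a load balancer
// where only some backend nodes have a too-small response limit.
function verdictFor(results: ProbeResult[]): Verdict {
  if (results.length === 0) throw new Error("no probe results");
  const tooBig = results.filter((r) => r.code === TOO_BIG_CODE).length;
  if (tooBig === 0) return "healthy"; // as far as the size limit is concerned
  return tooBig === results.length ? "failing" : "intermittent";
}

// Made-up sample: five attempts, one of which hit the size limit.
const sampleAttempts: ProbeResult[] = [
  { ok: true },
  { ok: true },
  { ok: false, code: TOO_BIG_CODE },
  { ok: true },
  { ok: true },
];

console.log(verdictFor(sampleAttempts)); // intermittent
```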

@xlc
Member

xlc commented Jan 6, 2025

Thanks for the investigation. Will change luckyfriday nodes to dwellir.

@dcolley

dcolley commented Jan 7, 2025

@xlc
feel free to connect to wss://kusama.ibp.network or wss://polkadot.ibp.network.
these nodes are geo-load balanced and allow high RPC payload size.

@xlc
Member

xlc commented Jan 7, 2025

@dcolley Nice. Where can I find a list of all the supported networks and their URLs?

@dcolley

dcolley commented Jan 7, 2025

@xlc please check this page https://wiki.ibp.network/docs/consumers/archives

For supported chains, the convention is: wss://<chainId>.ibp.network
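
As a quick illustration of that naming convention, a helper could build the endpoint from a chain id. The `ibpEndpoint` name and the character check are assumptions made for this sketch; which chain ids are actually supported is something only the linked page can confirm.

```typescript
// Build an endpoint following the stated convention wss://<chainId>.ibp.network.
// The allowed-character check is an assumption, not a documented rule.
function ibpEndpoint(chainId: string): string {
  if (!/^[a-z0-9-]+$/.test(chainId)) {
    throw new Error(`unexpected chain id: ${chainId}`);
  }
  return `wss://${chainId}.ibp.network`;
}

console.log(ibpEndpoint("kusama")); // wss://kusama.ibp.network
console.log(ibpEndpoint("polkadot")); // wss://polkadot.ibp.network
```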

@mchaffee

mchaffee commented Jan 7, 2025

Hey folks, apologies for the LuckyFriday clerical issue. Looks like we fat-fingered one of the nodes to 1MB instead of 10MB. We have since increased all nodes to 20MB.
