fix: tx client concurrency test #4104
📝 Walkthrough
The pull request introduces modifications to two files.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/user/e2e_test.go (1)
36-36: Consider parameterizing the number of transactions.
While increasing the test load is good for stress testing, consider making numTxs configurable through a test parameter or environment variable. This would allow for quick tests during development while maintaining the ability to run more extensive tests in CI.
- numTxs := 100
+ numTxs := getTestTxCount()

// Add this helper function:
+func getTestTxCount() int {
+	if count, err := strconv.Atoi(os.Getenv("TEST_TX_COUNT")); err == nil {
+		return count
+	}
+	return 100 // default to 100
+}
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/user/e2e_test.go (1 hunks)
pkg/user/tx_client.go (1 hunks)
🔇 Additional comments (2)
pkg/user/e2e_test.go (1)
42-42: Good fix: Buffered error channel prevents goroutine leaks.
The addition of buffer capacity to errCh is a crucial fix. Previously, if multiple errors occurred simultaneously, the error channel being unbuffered could cause goroutine leaks as the first error might not be read before subsequent errors tried to write to the channel.
pkg/user/tx_client.go (1)
466-468: Good addition: Proper context cancellation handling.
The addition of the context error check ensures that user-initiated cancellations are properly propagated instead of being masked by the "transaction not found" error. This maintains the correct error semantics and helps with debugging.
TestConcurrentTxSubmission fails in CI: https://github.com/celestiaorg/celestia-app/actions/runs/12430560032/job/34706233115?pr=4104#step:4:39
Is it a flake? I just retried
this should prolly be resolved by now
Yup, will try to investigate
Blocked on celestiaorg/celestia-core#1582
Actionable comments posted: 0
🧹 Nitpick comments (2)
pkg/user/e2e_test.go (2)
42-44: Consider parameterizing the test data size
While increasing the number of transactions to 100 provides better coverage, consider making this configurable through a test parameter or constant to facilitate different load testing scenarios.
- numTxs := 100
+ const defaultNumTxs = 100
+ numTxs := defaultNumTxs
70-75: Consider enhancing error reporting
The error handling is correct but could benefit from more detailed error reporting for debugging purposes.
 select {
 case err := <-errCh:
-	require.NoError(t, err)
+	require.NoError(t, err, "Failed during concurrent transaction submission: %v", err)
 default:
 }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/user/e2e_test.go (3 hunks)
🔇 Additional comments (4)
pkg/user/e2e_test.go (4)
26-36: LGTM: Comprehensive mempool version testing setup
The test now properly covers all mempool versions with appropriate timeout configuration. The increased timeout commit of 10 seconds provides adequate time for transaction processing in concurrent scenarios.
38-40: LGTM: Clean client initialization
The test client initialization is properly error-handled and follows the standard pattern.
47-50: LGTM: Fixed error channel capacity
The buffered error channel with capacity 1 is a good fix. This ensures the first error can be captured without blocking, addressing the issue mentioned in the PR objectives where errors were being ignored.
52-68: Verify timeout handling in high-load scenarios
The one-minute timeout might need adjustment based on system performance under high load with 100 concurrent transactions.
Also, consider adding logging to track the duration of transaction processing:
 go func(b *share.Blob) {
 	defer wg.Done()
+	start := time.Now()
 	_, err := txClient.SubmitPayForBlob(subCtx, []*share.Blob{b}, user.SetGasLimitAndGasPrice(500_000, appconsts.DefaultMinGasPrice))
+	if time.Since(start) > 30*time.Second {
+		t.Logf("Transaction took longer than 30s to process")
+	}
 	if err != nil && !errors.Is(err, context.Canceled) {
✅ Verification successful
Let me gather more information about the test file and its context.
One minute timeout is appropriate for concurrent transaction submission test
The one-minute timeout is well-justified for this test scenario because:
- The test submits 100 concurrent transactions with blobs of size 2048 bytes each
- The test node is configured with a 10-second block time (TimeoutCommit = 10 * time.Second)
- The test runs against three different mempool versions (v0, v1, v2)
- Each transaction has a gas limit of 500,000 which is sufficient for the blob size
- The timeout acts as a safety mechanism to prevent test hangs while allowing enough time for transaction processing across multiple blocks
The timeout duration provides an adequate buffer considering:
- Block time (10s) × Multiple blocks needed for 100 txs
- Network setup and initial height wait time
- Transaction processing overhead across different mempool versions
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any timeout-related issues in test logs
# Look for patterns indicating timeout issues or slow transaction processing
rg -A 3 "context deadline exceeded|timeout|took longer than expected"
Length of output: 29896
Script:
#!/bin/bash
# Check the test file content to understand the test scenario
cat pkg/user/e2e_test.go
Length of output: 2167
This PR makes two small tweaks:
- Fixes `TestConcurrentTxSubmission` by adding a capacity of 1 to the errCh. Currently errors were being ignored because the wait group meant that there wasn't a process to read to the channel as it was being written to. This fixes this
- Catches the case where a user cancels the context when calling `ConfirmTx`
**This test is broken until celestiaorg/celestia-core#1553 is resolved**
---------
Co-authored-by: nina / ნინა <[email protected]>
(cherry picked from commit f21716b)
This PR makes two small tweaks:
- Fixes `TestConcurrentTxSubmission` by adding a capacity of 1 to the errCh. Currently errors were being ignored because the wait group meant that there wasn't a process to read to the channel as it was being written to. This fixes this
- Catches the case where a user cancels the context when calling `ConfirmTx`
**This test is broken until celestiaorg/celestia-core#1553 is resolved**
This is an automatic backport of pull request #4104 done by [Mergify](https://mergify.com).
Co-authored-by: Callum Waters <[email protected]>
This PR makes two small tweaks:
- Fixes `TestConcurrentTxSubmission` by adding a capacity of 1 to the errCh. Currently errors were being ignored because the wait group meant that there wasn't a process to read to the channel as it was being written to. This fixes this
- Catches the case where a user cancels the context when calling `ConfirmTx`
This test is broken until celestiaorg/celestia-core#1553 is resolved
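To illustrate why the channel capacity matters, here is a self-contained sketch, not the test itself. It assumes the workers report errors with a non-blocking send (`select` with `default`), which is inferred from the description rather than confirmed by the diff; `firstError` is a hypothetical helper. With capacity 0, no receiver exists while the workers run, so every send falls through to `default` and the error is lost. With capacity 1, the first error is kept in the buffer until the wait group finishes.

```go
package main

import (
	"fmt"
	"sync"
)

// firstError starts n workers that all fail, lets each report its error
// with a non-blocking send, and reads the channel only after every worker
// has finished. capacity is the buffer size of the error channel.
func firstError(n, capacity int) error {
	errCh := make(chan error, capacity)
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			err := fmt.Errorf("worker %d failed", id)
			select {
			case errCh <- err: // needs a waiting receiver or free buffer space
			default: // otherwise the error is silently dropped
			}
		}(i)
	}

	wg.Wait() // nothing reads errCh while the workers are running

	select {
	case err := <-errCh:
		return err
	default:
		return nil
	}
}

func main() {
	fmt.Println("unbuffered, error seen:", firstError(10, 0) != nil)
	fmt.Println("capacity 1, error seen:", firstError(10, 1) != nil)
}
```

A capacity of 1 is enough here because the caller only needs to know that at least one submission failed, not to collect every error.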