feat: clean up p2p & implement missing peering functionality #2852
Do you check anywhere that the number of outbound peers does not exceed `max_num_outbound_peers`?
I see some calls to `NumOutbound()`, but I don't think I saw this limit being enforced.
I ran a test with a local node, and I'm failing to peer with test5.
A devnet cluster with this PR would be amazing for running some tests.
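To make the question above concrete, a limit check of this kind might look roughly like the sketch below. It is not taken from the PR: `dialBudget` and its parameters are hypothetical names, and the only assumption is that the switch can report its current outbound peer count (as `NumOutbound()` suggests) and knows the configured maximum.

```go
package main

import "fmt"

// dialBudget is a hypothetical helper (not part of TM2). It returns how many
// additional outbound dials are allowed before the configured maximum is hit.
// It never returns a negative value, even if the node is already over the limit.
func dialBudget(numOutbound, maxOutbound uint64) uint64 {
	if numOutbound >= maxOutbound {
		return 0
	}
	return maxOutbound - numOutbound
}

func main() {
	// With 8 outbound peers and a maximum of 10, two more dials fit.
	fmt.Println(dialBudget(8, 10))
}
```

A dial loop would consult such a budget before each dial attempt and skip dialing when it is zero.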
I tried to review this PR as thoroughly as I could, but it is too much. I vote to merge as it is and change/fix things as needed.
It is not easy to review the whole PR, since it involves a lot of changes. My general understanding is that it aims to make the p2p networking run more smoothly, and the PR description and the README (`tm2/pkg/p2p/README.md`) explain the main goals perfectly.
I tend to agree with @ajnavarro that we should merge and see, since it is not easy to anticipate every corner case or every bit of error handling that may be required.
Finally, I think this is a great improvement. Apart from perhaps more testing and tighter error handling, it should be considered the cornerstone for any future improvement to the p2p networking.
## Description

Closes #2308

There is a lot happening in this PR, so don't be discouraged by the files changed.
I'll outline the way the PR should be reviewed, and give you pointers along the way.

## What this PR set out to do

This PR initially set out to implement peer discovery in the TM2 `p2p` module -- that's it, basically.
The change was meant to be minimal, and not involve larger changes.

After spending more time than I'd like to admit in the `p2p` codebase, I've come to a couple of realizations:

- the code is insanely complex for what it needs to be doing.
- there are premature "optimizations" on every corner, with no real reason for them.
- the unit tests in the `p2p` package are sketchy to say the least, and not convincing at all that we were covering actual functionality we needed to cover.
- there are random disconnection issues with larger clusters.

All of these are temporarily fine, and not blocking us.
But the pattern was there -- to add peer discovery, it would require continuing the same pattern of code gymnastics present in the `p2p` module for the last 5+ years.

I took this opportunity, ahead of the mainnet launch, to fill out a few checkboxes:

- **simplify the code, trim _everything_ that's excess**. This is in line with our project `PHILOSOPHY.md`.
- create a safety net in the form of integration tests, and unit tests, that run _instantly_, and give us the confidence stuff actually works.
- make it easy peasy for us to debug future p2p problems, **when** they arise (we already keep seeing them on existing testnets, like `test4` and `test5`).

I wanted to implement all of this, without breaking any existing TM2 functionality that relies on the `p2p` module, or introducing additional complexities.

## What this PR actually accomplished

I'm proud to say that this PR brings more than a few bells and whistles to the table, in terms of TM2 improvements:

- _greatly_ simplified and faster `Switch` and `Transport` implementations. No more gazillion redundant checks, or expensive lookups, or convoluted APIs.
- a unit testing suite we can be proud of, and have confidence in. I rewrote the entire testing suite for the module, because the old implementation had severe limitations on mock-ability.
- peer discovery that works, and is not invasive. Goodbye random network pockets, and hanging nodes!
- many bugs, and potential issues squashed and erased. Regressions added, and passing.

For the sake of not making groundbreaking changes ahead of mainnet, I didn't touch a few things, and this will be evident from the code:

- I didn't touch the `conn` package, or how the multiplex connections are established and maintained (or the STS implementation) -- this would be too much, and require an exponential amount of time to get right.
- the `Peer` abstraction is still the same, and TM2 modules interact with the peers in the same way as before (directly, as `Reactor`s). I've outlined the issues with this in the README, so check it out.

In retrospect, I should've limited the scope of this PR by a lot. At the time I was at the mid-way point, I committed fully to leaving this module in a better state than I found it in, rather than leave additional tech debt for future cleanup.

The other primary goal of this PR is to scope out changes needed to upgrade the networking layer implementation into utilizing a stack like libp2p. I am happy to say that we have 0 limitations in terms of p2p functionality to make the switch. We're just bound by time. This upgrade is scheduled for after the mainnet MVP launch 🤞

## How do I review this PR?

There is no point in looking at the older implementation, and trying to figure out the changes from there.
There are just too many, and it can get overwhelming quickly.

Instead, as the **first step** -- read the new `p2p` [README](https://github.com/gnolang/gno/blob/dev/zivkovicmilos/bootnodes/tm2/pkg/p2p/README.md). It outlines how the `p2p` module works on a core level, and highlights current challenges.

After the README, open the `p2p` package in `tm2/pkg/p2p`, and start looking at the implementation from there. Leave comments on things that are unclear, or can be improved. I'll try to answer and give as much context as I can.

## What `p2p` config params are changed?

Here is a complete list of changed `p2p` configuration params, and the reasoning behind them:

- `UPNP` - UPNP port forwarding, it was completely unused, **removed**
- `PexReactor` - peer exchange reactor enablement flag, renamed to `PeerExchange`, **renamed**
- `SeedMode` - enabled network crawls, was tied to `PexReactor` being `true`. Useless flag, since `PeerExchange` exists, **removed**
- `AllowDuplicateIP` - useless flag to prevent same-IP dials, even outside previous dial "filters", **removed**
- `HandshakeTimeout` - excessive config option for setting an STS timeout, sane default is `3s`, **removed**
- `DialTimeout` - excessive config option for finishing a peer dial, sane default is `3s`, **removed**

The following config options were **removed**, as they related to testing, and were replaced by unit / integration tests:

- `TestDialFail`
- `TestFuzz`
- `TestFuzzConfig`

<details><summary>Contributors' checklist...</summary>

- [x] Added new tests, or not needed, or not feasible
- [x] Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory
- [x] Updated the official documentation or not needed
- [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message was included in the description
- [x] Added references to related issues and PRs
- [ ] Provided any useful hints for running manual tests
- [ ] Added new benchmarks to [generated graphs](https://gnoland.github.io/benchmarks), if any. More info [here](https://github.com/gnolang/gno/blob/master/.benchmarks/README.md).

</details>
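For anyone carrying an older node configuration forward, the `p2p` config changes listed above can be summarized as a small migration sketch. This is illustrative only and not part of the PR: `MigrateP2PConfig` is a hypothetical helper, the config is modelled as a plain map keyed by the option names used in the list, and `MaxNumOutboundPeers` in the usage example merely stands in for any option that is left untouched.

```go
package main

import (
	"fmt"
	"sort"
)

// removedKeys holds the p2p options that the list above marks as removed.
var removedKeys = map[string]bool{
	"UPNP":             true,
	"SeedMode":         true,
	"AllowDuplicateIP": true,
	"HandshakeTimeout": true,
	"DialTimeout":      true,
	"TestDialFail":     true,
	"TestFuzz":         true,
	"TestFuzzConfig":   true,
}

// renamedKeys maps an old option name to its new name.
var renamedKeys = map[string]string{
	"PexReactor": "PeerExchange",
}

// MigrateP2PConfig is a hypothetical helper (not part of TM2). It returns a
// copy of the config with removed options dropped and renamed options moved
// to their new key, plus the sorted list of options that were dropped.
func MigrateP2PConfig(old map[string]any) (map[string]any, []string) {
	migrated := make(map[string]any, len(old))
	var dropped []string

	for key, value := range old {
		if removedKeys[key] {
			dropped = append(dropped, key)
			continue
		}
		if newKey, ok := renamedKeys[key]; ok {
			key = newKey
		}
		migrated[key] = value
	}

	// Map iteration order is random, so sort for a stable report.
	sort.Strings(dropped)
	return migrated, dropped
}

func main() {
	old := map[string]any{
		"PexReactor":          true,
		"UPNP":                false,
		"MaxNumOutboundPeers": 10, // placeholder for an untouched option
	}
	migrated, dropped := MigrateP2PConfig(old)
	fmt.Println("migrated:", migrated)
	fmt.Println("dropped:", dropped)
}
```

The sketch only renames and drops keys. The `3s` defaults mentioned for the handshake and dial timeouts are fixed by the implementation rather than by config, so there is nothing to carry over for them.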