Network clustering #1053

Open
wants to merge 1 commit into
base: main

Conversation

edlerd
Collaborator

@edlerd edlerd commented Jan 11, 2025

Done

  • adjust network list to show cluster member specific networks
  • split network list entries and detail pages to be one entry/page per cluster member for physical interfaces
  • add a pattern for cluster-specific inputs for the parent of a physical managed network. This is to be reused for other cluster-specific inputs, such as server settings or storage pool configuration
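
A cluster-specific input boils down to one value per cluster member. The following is a minimal sketch of such a shape, not the code in this PR; the type and function names are hypothetical and only illustrate how a form could expand a shared value and detect members that still lack one.

```typescript
// Hypothetical shape: one value per cluster member name, e.g. { "node-1": "eth0" }.
type ClusterSpecificValues = Record<string, string>;

// Expand a single value to every member ("same for all cluster members").
function applyToAllMembers(
  memberNames: string[],
  value: string,
): ClusterSpecificValues {
  const result: ClusterSpecificValues = {};
  for (const name of memberNames) {
    result[name] = value;
  }
  return result;
}

// List the members that have no value yet (missing key or empty string),
// so a form can decide whether submission should be allowed.
function membersWithoutValue(
  memberNames: string[],
  values: ClusterSpecificValues,
): string[] {
  return memberNames.filter((name) => !values[name]);
}
```

A form built on this could keep its submit button disabled while `membersWithoutValue` returns a non-empty list.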

QA

  1. Run the LXD-UI:
  2. Perform the following QA steps:
    • Browse the network list in an unclustered backend, check the filters
    • Browse the network list in a clustered backend, use the filters, and check that clicking on the cluster member chips applies filtering
    • Create and edit a physical network in an unclustered backend
    • Create and edit a physical network in a clustered backend. Ensure the connections diagram updates and the chips in it link correctly. Ensure the chips in the "parent" selector link correctly. Try changing and breaking the parent selector in the clustered backend when creating or editing a physical network.
    • Browse a physical unmanaged network in a clustered and unclustered backend

@edlerd edlerd force-pushed the network-clustering branch 6 times, most recently from 54c1afa to e2e4075 Compare January 15, 2025 15:44
@edlerd edlerd changed the title Network clustering (wip) Network clustering Jan 15, 2025
@edlerd edlerd marked this pull request as ready for review January 15, 2025 15:46
@edlerd edlerd force-pushed the network-clustering branch 2 times, most recently from add1260 to 2379d77 Compare January 15, 2025 16:49

@MasWho MasWho left a comment

Some interim code comments. The network list with filtering looks pretty good (cluster and non-cluster conditions) from QA perspective. Will go through the other QA items as well.

src/pages/networks/NetworkSearchFilter.tsx (outdated, resolved)
src/pages/networks/NetworkSearchFilter.tsx (outdated, resolved)
src/pages/networks/NetworkSearchFilter.tsx (outdated, resolved)
src/api/networks.tsx (outdated, resolved)
src/api/networks.tsx (outdated, resolved)
src/api/networks.tsx (outdated, resolved)
src/api/networks.tsx (outdated, resolved)
src/api/networks.tsx (outdated, resolved)
src/pages/networks/NetworkList.tsx (outdated, resolved)
src/pages/networks/NetworkList.tsx (outdated, resolved)
@mas-who
Collaborator

mas-who commented Jan 16, 2025

Some QA observations below:

  1. When clicking on the cluster member resource links in the network list table, all existing search params get cleared. Why not add the member search to the existing search params? Is it because it would be a bit weird in the case that the member search is already present, and it would look like nothing is happening?

  2. Noticed that on smaller screen sizes the action buttons and the search filter are positioned weirdly. Maybe we can add the actions in a contextual menu similar to how we do it in the instance detail page. Then there will be more space for the search and filter as well. wdyt?
    Screenshot from 2025-01-16 13-40-30

  3. When creating a clustered physical network with the following parent configs
    Screenshot from 2025-01-16 14-00-19
    On submission I get the following error:
    Screenshot from 2025-01-16 14-01-03
    Trying to create the network with the same name fails at this point, as it already exists in LXD.
    Screenshot from 2025-01-16 14-01-46
    I think we should disable the submit button if parents are not selected for all members?
    NOTE: the edit case seems fine, the backend seems to block the operation and the configs do not get persisted.

  4. After creating a physical network with some parent interface across cluster members, trying to create another network using the same parent interfaces results in a creation error indicating they are in use. However, the network still gets created in an "Errored" state.

  5. Clicking on the member resource link on a network detail page does not result in search params being set after redirect to the network list page. Is that intended?

  6. After creating an OVN network with a physical uplink, it is possible to delete the physical uplink. However, on the network topology for the OVN network, the deleted uplink still shows up somehow.

  7. Observations for when a cluster member is down:
    a. Trying to create a physical network results in an error message "peer node 10.94.160.130:8443 is down". However, the network gets created and shows up in the network list with the "Cluster-wide" member category.
    b. It is not possible to edit a physical network with the same error message as above.
    c. It is not possible to delete the physical network with the same error message as above.
    d. Trying to visit the physical network detail page for the vm that is down results in a 500 error "Missing event connection with target cluster member". Currently the detail page loads for a long time and then shows a blank page. This should probably be handled by displaying the error message.
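
On item 1 above, one way to keep the existing search params when a member chip is clicked is to set only the member key on a copy of the current query string. A minimal sketch, assuming the filters live in the URL query string under a hypothetical `member` key (the actual parameter name in the UI may differ):

```typescript
// Hypothetical helper: add or replace one filter while keeping the others.
// The "member" key is an assumption, not necessarily the UI's parameter name.
function withMemberFilter(currentSearch: string, member: string): string {
  // URLSearchParams accepts a query string with or without a leading "?".
  const params = new URLSearchParams(currentSearch);
  // set() replaces an existing value for the key instead of appending a second one.
  params.set("member", member);
  return params.toString();
}
```

If the member filter is already present with the same value, the resulting query string is unchanged, which matches the "nothing is happening" concern raised in the question.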

@edlerd edlerd force-pushed the network-clustering branch 3 times, most recently from 8f766bf to b0927a5 Compare January 16, 2025 17:22
@edlerd
Collaborator Author

edlerd commented Jan 16, 2025

Resolved issues 1-3 and 5.

  4. After creating a physical network with some parent interface across cluster members, trying to create another network using the same parent interfaces results in a creation error indicating they are in use. However, the network still gets created in an "Errored" state.

This sounds like the expected behaviour. Wdyt?

  6. After creating an OVN network with a physical uplink, it is possible to delete the physical uplink. However, on the network topology for the OVN network, the deleted uplink still shows up somehow.

This shows up, because the uplink config of the OVN network doesn't change when deleting the uplink network. It might be that LXD should refuse to delete the uplink, but that should be in the API then, not in the UI itself.

  7. Observations for when a cluster member is down:
    a. Trying to create a physical network results in an error message "peer node 10.94.160.130:8443 is down". However, the network gets created and shows up in the network list with the "Cluster-wide" member category.
    b. It is not possible to edit a physical network with the same error message as above.
    c. It is not possible to delete the physical network with the same error message as above.
    d. Trying to visit the physical network detail page for the vm that is down results in a 500 error "Missing event connection with target cluster member". Currently the detail page loads for a long time and then shows a blank page. This should probably be handled by displaying the error message.

This needs future work.

@edlerd edlerd force-pushed the network-clustering branch 2 times, most recently from 016a2d5 to 5b8158b Compare January 16, 2025 17:48
@mas-who
Collaborator

mas-who commented Jan 17, 2025

Resolved issues 1-3 and 5.

  4. After creating a physical network with some parent interface across cluster members, trying to create another network using the same parent interfaces results in a creation error indicating they are in use. However, the network still gets created in an "Errored" state.

This sounds like the expected behaviour. Wdyt?

My main concern is that the network gets created even when it's not valid. Would it be possible to check upon parent selection if it is already used by another network and reflect that as an error message on the creation / edit form? If that's too complex I think the current behaviour is also fine.
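
A rough sketch of what such a check upon parent selection could look like. The `NetworkLike` shape and the function name are hypothetical and reduced to the fields needed here. It also assumes that reusing a parent is always invalid and that the list passed in already reflects the relevant cluster member, neither of which is confirmed in this thread.

```typescript
// Minimal, hypothetical network shape; the real network type has more fields.
interface NetworkLike {
  name: string;
  type: string;
  config: { parent?: string };
}

// Return the name of another physical network already using this parent, if any.
// ownName excludes the network being edited from the comparison.
function findParentConflict(
  networks: NetworkLike[],
  parent: string,
  ownName?: string,
): string | undefined {
  return networks.find(
    (network) =>
      network.type === "physical" &&
      network.name !== ownName &&
      network.config.parent === parent,
  )?.name;
}
```

The form could surface the returned name in a message next to the parent selector instead of waiting for the creation error.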

  6. After creating an OVN network with a physical uplink, it is possible to delete the physical uplink. However, on the network topology for the OVN network, the deleted uplink still shows up somehow.

This shows up, because the uplink config of the OVN network doesn't change when deleting the uplink network. It might be that LXD should refuse to delete the uplink, but that should be in the API then, not in the UI itself.

Noted, perhaps we should raise this with the core team?

  7. Observations for when a cluster member is down:
    a. Trying to create a physical network results in an error message "peer node 10.94.160.130:8443 is down". However, the network gets created and shows up in the network list with the "Cluster-wide" member category.
    b. It is not possible to edit a physical network with the same error message as above.
    c. It is not possible to delete the physical network with the same error message as above.
    d. Trying to visit the physical network detail page for the vm that is down results in a 500 error "Missing event connection with target cluster member". Currently the detail page loads for a long time and then shows a blank page. This should probably be handled by displaying the error message.

This needs future work.
Noted 👍

@edlerd edlerd force-pushed the network-clustering branch from 5b8158b to 9c42ee6 Compare January 17, 2025 08:45
@edlerd
Collaborator Author

edlerd commented Jan 17, 2025

Improved the error handling; that should resolve 7.

  4. After creating a physical network with some parent interface across cluster members, trying to create another network using the same parent interfaces results in a creation error indicating they are in use. However, the network still gets created in an "Errored" state.

This sounds like the expected behaviour. Wdyt?

My main concern is that the network gets created even when it's not valid. Would it be possible to check upon parent selection if it is already used by another network and reflect that as an error message on the creation / edit form? If that's too complex I think the current behaviour is also fine.

I am not 100% sure we can never reuse an interface as parent. Reported this edge case to lxd: canonical/lxd#14810

  6. After creating an OVN network with a physical uplink, it is possible to delete the physical uplink. However, on the network topology for the OVN network, the deleted uplink still shows up somehow.

This shows up, because the uplink config of the OVN network doesn't change when deleting the uplink network. It might be that LXD should refuse to delete the uplink, but that should be in the API then, not in the UI itself.

Noted, perhaps we should raise this with the core team?

I tried to reproduce it, but was getting an error. I couldn't delete the uplink in use by the OVN network. So this might not be an issue after all.

@edlerd edlerd force-pushed the network-clustering branch 3 times, most recently from fb1b4e9 to b52956e Compare January 17, 2025 12:22
Collaborator

@mas-who mas-who left a comment

This is looking really good! Couldn't actually find that many issues QA-wise. Left some code comments.

src/api/networks.tsx (resolved)
src/api/networks.tsx (outdated, resolved)
src/pages/networks/NetworkList.tsx (resolved)
src/util/intersection.tsx (outdated, resolved)
src/util/networkForm.tsx (outdated, resolved)
src/components/ClusterSpecificSelect.tsx (outdated, resolved)
src/components/ClusterSpecificSelect.tsx (resolved)
src/components/ClusterSpecificSelect.tsx (outdated, resolved)
src/pages/networks/forms/NetworkParentSelector.tsx (outdated, resolved)
@mas-who
Collaborator

mas-who commented Jan 17, 2025

QA comments:

  1. On medium screen sizes, the network list table gets cut off. Should we make the table width adjust with the viewport width until the rows turn into cards?
    Screenshot from 2025-01-17 15-47-06

  2. When creating a physical network, should we pre-select a network interface when the "Same for all cluster members" option is checked?
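
For item 2, a pre-selection would only make sense for an interface that exists on every cluster member. A minimal sketch under that assumption follows. The input shape and function names are hypothetical and do not mirror the helpers in this PR.

```typescript
// Hypothetical input: interface names available on each cluster member.
type InterfacesByMember = Record<string, string[]>;

// Interfaces present on every member, in the order of the first member's list.
function commonInterfaces(byMember: InterfacesByMember): string[] {
  const lists = Object.values(byMember);
  const first = lists[0] ?? [];
  const rest = lists.slice(1);
  return first.filter((name) => rest.every((list) => list.includes(name)));
}

// Candidate to pre-select for "Same for all cluster members";
// undefined when no interface is shared by all members.
function defaultSharedInterface(
  byMember: InterfacesByMember,
): string | undefined {
  return commonInterfaces(byMember)[0];
}
```

When no interface is shared, the selector could stay empty and the per-member inputs would be the only valid option.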

@edlerd edlerd force-pushed the network-clustering branch from b52956e to 0bc7a06 Compare January 20, 2025 13:40
@edlerd
Collaborator Author

edlerd commented Jan 20, 2025

All open issues mentioned above should be resolved.

@edlerd edlerd requested review from mas-who and MasWho January 20, 2025 13:48