From: Chris Horn
Date: Wed, 1 Jun 2022 02:19:07 +0000 (-0500)
Subject: LU-15929 lnet: Correct net selection for router ping
X-Git-Tag: 2.15.52~62
X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=commitdiff_plain;h=2431e099b143a4c7e7f951c912263f1536db07f0

LU-15929 lnet: Correct net selection for router ping

lnet_find_best_ni_on_local_net() contains logic for restricting the
NI selection to a net specified by lnet_peer::lp_disc_net_id. The
purpose of this is to ensure that LNet peers ping every interface on
a router at a regular interval as part of the LNet router health
feature. However, this logic is flawed because lnet_msg_discovery()
is used to determine whether the message being sent is a discovery
message, but that function actually determines whether a given
message can _trigger_ discovery.

Introduce a new function, lnet_msg_is_ping(), which determines
whether a given lnet_msg is a GET on the LNET_RESERVED_PORTAL.

Modify lnet_find_best_ni_on_local_net() to restrict NI selection to
lp_disc_net_id iff:
 1. lp_disc_net_id is non-zero
 2. The peer has the LNET_PEER_RTR_DISCOVERY flag set.
 3. lnet_msg_is_ping() returns true

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11017
Signed-off-by: Chris Horn
Change-Id: I3dbdfd5c44b6167d24b7b6e0b1097db0b3c5cb76
Reviewed-on: https://review.whamcloud.com/47527
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Frank Sehr
Reviewed-by: Cyril Bordage
Reviewed-by: Oleg Drokin
---

diff --git a/lnet/lnet/lib-move.c b/lnet/lnet/lib-move.c
index 2a30af8..e23a532 100644
--- a/lnet/lnet/lib-move.c
+++ b/lnet/lnet/lib-move.c
@@ -1762,7 +1762,8 @@ lnet_reserved_msg(struct lnet_msg *msg)
 	return false;
 }
 
-/*
+/* Can the specified message trigger peer discovery?
+ *
  * Traffic to the LNET_RESERVED_PORTAL may not trigger peer discovery,
  * because such traffic is required to perform discovery. We therefore
  * exclude all GET and PUT on that portal. We also exclude all ACK and
@@ -1776,6 +1777,18 @@ lnet_msg_discovery(struct lnet_msg *msg)
 	return !(lnet_reserved_msg(msg) || lnet_msg_is_response(msg));
 }
 
+/* Is the specified message an LNet ping?
+ */
+static bool
+lnet_msg_is_ping(struct lnet_msg *msg)
+{
+	if (msg->msg_type == LNET_MSG_GET &&
+	    msg->msg_hdr.msg.get.ptl_index == LNET_RESERVED_PORTAL)
+		return true;
+
+	return false;
+}
+
 #define SRC_SPEC	0x0001
 #define SRC_ANY		0x0002
 #define LOCAL_DST	0x0004
@@ -2432,10 +2445,14 @@ lnet_find_best_ni_on_local_net(struct lnet_peer *peer, int md_cpt,
 	__u32 best_net_sel_prio = LNET_MAX_SELECTION_PRIORITY;
 	__u32 net_sel_prio;
 
-	/* if this is a discovery message and lp_disc_net_id is
-	 * specified then use that net to send the discovery on.
+	/* If lp_disc_net_id is set, this peer is a router undergoing
+	 * discovery, and this message is an LNet ping, then this may be a
+	 * discovery message and we need to select an NI on the peer net
+	 * specified by lp_disc_net_id
 	 */
-	if (discovery && peer->lp_disc_net_id) {
+	if (peer->lp_disc_net_id &&
+	    (peer->lp_state & LNET_PEER_RTR_DISCOVERY) &&
+	    lnet_msg_is_ping(msg)) {
 		best_lpn = lnet_peer_get_net_locked(peer, peer->lp_disc_net_id);
 		if (best_lpn && lnet_get_net_locked(best_lpn->lpn_net_id))
 			goto select_best_ni;
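
Illustration only, not part of the patch: the three-part condition from
the commit message can be exercised outside the kernel with a small
standalone C sketch. Everything prefixed demo_ or DEMO_ is a
hypothetical stand-in rather than the real LNet definition. The structs
carry only the fields the check reads, and the constant values are
placeholders chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values, NOT the real LNet constants. */
enum demo_msg_type { DEMO_MSG_PUT = 1, DEMO_MSG_GET = 2 };
#define DEMO_RESERVED_PORTAL	0u
#define DEMO_PEER_RTR_DISCOVERY	(1u << 0)

/* Simplified stand-in for the parts of struct lnet_msg the check reads. */
struct demo_msg {
	enum demo_msg_type msg_type;
	unsigned int ptl_index;		/* portal targeted by a GET or PUT */
};

/* Simplified stand-in for the parts of struct lnet_peer the check reads. */
struct demo_peer {
	unsigned int lp_disc_net_id;	/* 0 means "not set" */
	unsigned int lp_state;		/* bit flags */
};

/* Mirrors lnet_msg_is_ping(): a ping is a GET on the reserved portal. */
static bool
demo_msg_is_ping(const struct demo_msg *msg)
{
	return msg->msg_type == DEMO_MSG_GET &&
	       msg->ptl_index == DEMO_RESERVED_PORTAL;
}

/* Mirrors the new condition in lnet_find_best_ni_on_local_net():
 * restrict NI selection to lp_disc_net_id only if all three hold.
 */
static bool
demo_restrict_to_disc_net(const struct demo_peer *peer,
			  const struct demo_msg *msg)
{
	return peer->lp_disc_net_id != 0 &&
	       (peer->lp_state & DEMO_PEER_RTR_DISCOVERY) != 0 &&
	       demo_msg_is_ping(msg);
}

/* Walk through one passing case and one failing case per condition. */
void
demo_self_check(void)
{
	struct demo_peer rtr = { 7, DEMO_PEER_RTR_DISCOVERY };
	struct demo_peer no_net = { 0, DEMO_PEER_RTR_DISCOVERY };
	struct demo_peer no_flag = { 7, 0 };
	struct demo_msg ping = { DEMO_MSG_GET, DEMO_RESERVED_PORTAL };
	struct demo_msg put = { DEMO_MSG_PUT, DEMO_RESERVED_PORTAL };
	struct demo_msg get_other = { DEMO_MSG_GET, 4 };

	assert(demo_restrict_to_disc_net(&rtr, &ping));
	assert(!demo_restrict_to_disc_net(&rtr, &put));		/* not a GET */
	assert(!demo_restrict_to_disc_net(&rtr, &get_other));	/* wrong portal */
	assert(!demo_restrict_to_disc_net(&no_net, &ping));	/* net id unset */
	assert(!demo_restrict_to_disc_net(&no_flag, &ping));	/* flag clear */
}
```

Calling demo_self_check() from a test harness runs all five cases. Only
the first one, a GET on the reserved portal sent to a router peer with
lp_disc_net_id set and the discovery flag raised, selects the
restricted net.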