From: James Simmons
Date: Fri, 5 Apr 2024 00:01:32 +0000 (-0400)
Subject: LU-17700 lnet: properly calculate ping buffer size
X-Git-Tag: 2.15.63~87
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=c987469d510a8edf2bad5c28239337b47015c82e;hp=0176629ab3f71e88850ab95796b0e519c4d0f740;p=fs%2Flustre-release.git

LU-17700 lnet: properly calculate ping buffer size

Originally lnet_ping() sized the ping buffer using
lnet_ping_sts_size(). The limitation of that approach is that if
the NID passed into lnet_ping_sts_size() is a smaller NID, such as
IPv4, the buffer can be too small. For example, if n_ids is 4 and
three of the returned NIDs are IPv4 but one is IPv6, the buffer
can overflow.

The solution is to allocate for the maximum possible NID size.
That can be done with LNET_ANY_NID, which fills in all the fields.
lnet_ping_sts_size() has to handle the size properly when it is
given LNET_ANY_NID. If struct lnet_nid ever grows in the future,
this code should still work.

Also cap the maximum size of the ping buffer to avoid o2iblnd
failures from using RDMA, which sends data that doesn't support
large NIDs.

Fixes: d137e9823ca ("LU-10003 lnet: use Netlink to support LNet ping commands")
Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I5b61add2b3701cad12074515f45773bbc9fbc583
Signed-off-by: James Simmons
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54673
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Serguei Smirnov
Reviewed-by: Cyril Bordage
Reviewed-by: Andreas Dilger
Reviewed-by: Stephane Thiell
Reviewed-by: Frank Sehr
Reviewed-by: Oleg Drokin
---

diff --git a/lnet/include/lnet/lib-types.h b/lnet/include/lnet/lib-types.h
index a02eab2..fadaead 100644
--- a/lnet/include/lnet/lib-types.h
+++ b/lnet/include/lnet/lib-types.h
@@ -1225,6 +1225,10 @@ lnet_ping_sts_size(const struct lnet_nid *nid)
 {
 	int size;
 
+	/* for deciding the size of the ping buffer */
+	if (unlikely(LNET_NID_IS_ANY(nid)))
+		return sizeof(struct lnet_ni_large_status);
+
 	if (nid_is_nid4(nid))
 		return sizeof(struct lnet_ni_status);
 
diff --git a/lnet/lnet/api-ni.c b/lnet/lnet/api-ni.c
index 9cfe617..973b615 100644
--- a/lnet/lnet/api-ni.c
+++ b/lnet/lnet/api-ni.c
@@ -9054,6 +9054,9 @@ lnet_ping_event_handler(struct lnet_event *event)
 		complete(&pd->completion);
 }
 
+/* Max buffer we allow to be sent. Larger values will cause IB failures */
+#define LNET_PING_BUFFER_MAX	3960
+
 static int lnet_ping(struct lnet_processid *id, struct lnet_nid *src_nid,
 		     signed long timeout, struct lnet_genl_ping_list *plist,
 		     int n_ids)
@@ -9085,7 +9088,11 @@ static int lnet_ping(struct lnet_processid *id, struct lnet_nid *src_nid,
 	if (id->pid == LNET_PID_ANY)
 		id->pid = LNET_PID_LUSTRE;
 
-	id_bytes += n_ids * sizeof(struct lnet_nid);
+	/* Allocate maximum possible NID size */
+	id_bytes += lnet_ping_sts_size(&LNET_ANY_NID) * n_ids;
+	if (id_bytes > LNET_PING_BUFFER_MAX)
+		id_bytes = LNET_PING_BUFFER_MAX;
+
 	pbuf = lnet_ping_buffer_alloc(id_bytes, GFP_NOFS);
 	if (!pbuf)
 		return -ENOMEM;
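
The sizing rule in the last hunk (assume every entry is the largest
size, then clamp) can be illustrated outside the kernel with a rough,
standalone sketch. It is not the LNet code: the demo_* names, the base
byte count, and the two entry sizes are hypothetical stand-ins chosen
for the arithmetic, and only the 3960 cap is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; the real ones come from sizeof() on the
 * LNet status structs, which are not reproduced here.
 */
#define DEMO_SMALL_ENTRY	((size_t)16)	/* assumed IPv4-style entry */
#define DEMO_LARGE_ENTRY	((size_t)24)	/* assumed large-NID entry */
#define DEMO_BUFFER_MAX		((size_t)3960)	/* cap value from the patch */

/* Sizing that assumes every entry is small: too optimistic. */
static size_t demo_small_sizing(size_t base, size_t n_ids)
{
	return base + DEMO_SMALL_ENTRY * n_ids;
}

/* Worst-case sizing: every entry counted as large, then clamped. */
static size_t demo_worst_case_sizing(size_t base, size_t n_ids)
{
	size_t bytes = base + DEMO_LARGE_ENTRY * n_ids;

	return bytes > DEMO_BUFFER_MAX ? DEMO_BUFFER_MAX : bytes;
}

/* Bytes a reply with a given mix of entries actually occupies. */
static size_t demo_reply_bytes(size_t base, size_t n_small, size_t n_large)
{
	return base + DEMO_SMALL_ENTRY * n_small + DEMO_LARGE_ENTRY * n_large;
}

/* The commit message's example: 4 ids, three small and one large. */
void demo_self_check(void)
{
	size_t reply = demo_reply_bytes(8, 3, 1);

	assert(demo_small_sizing(8, 4) < reply);	/* would overflow */
	assert(demo_worst_case_sizing(8, 4) >= reply);	/* always fits */
}
```

Calling demo_self_check() from a test program exercises the mixed
IPv4/IPv6 case described in the commit message. With these assumed
sizes the small-entry sizing falls short of the reply, while the
worst-case sizing covers it. Once the clamp applies, the buffer no
longer scales with n_ids, so a sufficiently large reply would not fit
in full.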