From 142b9baeba254a81751db5e143c0788ad29e7e40 Mon Sep 17 00:00:00 2001
From: Sergey Cheremencev <scherementsev@ddn.com>
Date: Wed, 25 Sep 2024 19:27:49 +0300
Subject: [PATCH] LU-18049 mgc: fix memory corruption

Fix memory corruption in mgc_apply_recover_logs()
caused by incorrect pointer arithmetic on struct lnet_nid.
The loop advanced a struct lnet_nid pointer by
sizeof(struct lnet_nid), which the compiler scales by the
element size a second time. When mne_nid_count was > 1,
the 2nd iteration stored the NID at addr+400
(sizeof(lnet_nid) * sizeof(lnet_nid)) instead of the next
array element, i.e. addr+20.
This caused many memory corruptions with different
backtraces, depending on the owner of the memory located
near the NID array. Corruptions usually happened in
kmalloc-64. It might corrupt the data inside slab objects
or SLUB service structures (the free pointer).

Test-Parameters: trivial testlist=sanity-sec env=ONLY=31,ONLY_REPEAT=10 serverversion=2.15
Test-Parameters: trivial testlist=sanity-sec env=ONLY=31,ONLY_REPEAT=10 serverversion=EXA6

Fixes: e4d2d4ff74 ("LU-13306 mgc: handle large NID formats")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I3719a09a3814f24ef26c2b118de629b42d13313c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56500
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
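Note for reviewers: the addr+400 vs. addr+20 arithmetic from the
commit message can be reproduced in userspace with the minimal
sketch below. It assumes a hypothetical 20-byte stand-in,
struct demo_nid, not the real struct lnet_nid definition, and the
function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 20-byte stand-in for struct lnet_nid (assumption:
 * only the size matters for this demonstration). */
struct demo_nid {
	uint8_t bytes[20];
};

/* Byte offset reached by the old loop step:
 *   nid += sizeof(struct demo_nid);
 * Pointer arithmetic is scaled by the element size, so this moves
 * 20 elements forward, not 20 bytes. The pool is sized so that the
 * resulting pointer stays inside the array. */
size_t buggy_step_offset(void)
{
	static struct demo_nid pool[sizeof(struct demo_nid) + 1];
	struct demo_nid *nid = pool;

	nid += sizeof(struct demo_nid);
	return (size_t)((char *)nid - (char *)pool);
}

/* Byte offset reached by the fixed form: &nidlist[1]. */
size_t fixed_step_offset(void)
{
	static struct demo_nid nidlist[2];

	return (size_t)((char *)&nidlist[1] - (char *)nidlist);
}
```

With a 20-byte element the first helper yields 400 and the second
yields 20, matching the offsets described in the commit message.
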
 lustre/mgc/mgc_request.c   | 8 ++------
 lustre/tests/sanity-sec.sh | 2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/lustre/mgc/mgc_request.c b/lustre/mgc/mgc_request.c
index 9513a99..efb72a0 100644
--- a/lustre/mgc/mgc_request.c
+++ b/lustre/mgc/mgc_request.c
@@ -1280,7 +1280,6 @@ static int mgc_apply_recover_logs(struct obd_device *mgc,
 		prev_version = entry->mne_version;
 
 		if (entry->mne_nid_type == 0) {
-			struct lnet_nid *nid;
 			int i;
 
 			OBD_ALLOC_PTR_ARRAY(nidlist, entry->mne_nid_count);
@@ -1296,11 +1295,8 @@ static int mgc_apply_recover_logs(struct obd_device *mgc,
 				lustre_swab_mgs_nidtbl_entry_content(entry);
 
 			/* Turn old NID format to newer format. */
-			nid = nidlist;
-			for (i = 0; i < entry->mne_nid_count; i++) {
-				lnet_nid4_to_nid(entry->u.nids[i], nid);
-				nid += sizeof(struct lnet_nid);
-			}
+			for (i = 0; i < entry->mne_nid_count; i++)
+				lnet_nid4_to_nid(entry->u.nids[i], &nidlist[i]);
 		} else {
 			/* Handle the case if struct lnet_nid is expanded in
 			 * the future. The MGS should prevent this but just
diff --git a/lustre/tests/sanity-sec.sh b/lustre/tests/sanity-sec.sh
index d5bde69..ef841cb 100755
--- a/lustre/tests/sanity-sec.sh
+++ b/lustre/tests/sanity-sec.sh
@@ -2592,6 +2592,8 @@ test_31() {
 
 	export LNETCTL=$(which lnetctl 2> /dev/null)
 
+	(( $MDS1_VERSION >= $(version_code 2.15.0) )) ||
+		skip "Need MDS >= 2.15.0"
 	[ -z "$LNETCTL" ] && skip "without lnetctl support." && return
 	local_mode && skip "in local mode."
 
-- 
1.8.3.1