LU-18049 mgc: fix memory corruption
author Sergey Cheremencev <scherementsev@ddn.com>
Wed, 25 Sep 2024 16:27:49 +0000 (19:27 +0300)
committer Oleg Drokin <green@whamcloud.com>
Mon, 30 Sep 2024 15:37:23 +0000 (15:37 +0000)
Fix memory corruption in mgc_apply_recover_logs()
caused by a pointer arithmetic mistake on struct lnet_nid.
The loop advanced the pointer with
nid += sizeof(struct lnet_nid), which moves it by
sizeof(struct lnet_nid) elements rather than bytes.
When mne_nid_count was > 1, the 2nd iteration stored
the NID at addr+400
(sizeof(struct lnet_nid) * sizeof(struct lnet_nid))
instead of at the next array element, i.e. addr+20.
This caused many memory corruptions with different
backtraces, depending on the owner of the memory located
near the NID array. Corruptions usually happened in kmalloc-64.
It might corrupt the data inside slab objects or SLUB
service structures (freepointer).
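
The offsets above follow from C pointer arithmetic, which scales by
the element size. A minimal sketch, assuming a hypothetical 20-byte
struct demo_nid in place of the real struct lnet_nid (the helper
names are illustrative and not part of the Lustre tree):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct lnet_nid: 20 bytes, matching the
 * element size quoted in the commit message. Not the real definition.
 */
struct demo_nid {
	uint8_t bytes[20];
};

static_assert(sizeof(struct demo_nid) == 20,
	      "demo element must be 20 bytes");

/* Byte distance covered by one step of the old loop. */
static size_t buggy_step_bytes(void)
{
	/* Large enough that the stepped pointer stays inside the array. */
	struct demo_nid arr[sizeof(struct demo_nid) + 1];
	struct demo_nid *p = arr;

	/* Adding N to a pointer moves it N elements, not N bytes. */
	p += sizeof(struct demo_nid);

	return (size_t)((char *)p - (char *)arr);
}

/* Byte distance covered by one step of the fixed loop. */
static size_t fixed_step_bytes(void)
{
	struct demo_nid arr[2];

	/* Indexing moves exactly one element. */
	return (size_t)((char *)&arr[1] - (char *)arr);
}
```

Under these assumptions buggy_step_bytes() returns 400 and
fixed_step_bytes() returns 20, which is why indexing with
&nidlist[i] keeps every store inside the allocated array.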

Test-Parameters: trivial testlist=sanity-sec env=ONLY=31,ONLY_REPEAT=10 serverversion=2.15
Test-Parameters: trivial testlist=sanity-sec env=ONLY=31,ONLY_REPEAT=10 serverversion=2.15
Test-Parameters: trivial testlist=sanity-sec env=ONLY=31,ONLY_REPEAT=10 serverversion=EXA6

Fixes: e4d2d4ff74 ("LU-13306 mgc: handle large NID formats")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I3719a09a3814f24ef26c2b118de629b42d13313c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56500
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/mgc/mgc_request.c
lustre/tests/sanity-sec.sh

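diff --git a/lustre/mgc/mgc_request.c b/lustre/mgc/mgc_request.c
index 9513a99..efb72a0 100644
--- a/lustre/mgc/mgc_request.c
+++ b/lustre/mgc/mgc_request.c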
@@ -1280,7 +1280,6 @@ static int mgc_apply_recover_logs(struct obd_device *mgc,
                prev_version = entry->mne_version;
 
                if (entry->mne_nid_type == 0) {
-                       struct lnet_nid *nid;
                        int i;
 
                        OBD_ALLOC_PTR_ARRAY(nidlist, entry->mne_nid_count);
@@ -1296,11 +1295,8 @@ static int mgc_apply_recover_logs(struct obd_device *mgc,
                                lustre_swab_mgs_nidtbl_entry_content(entry);
 
                        /* Turn old NID format to newer format. */
-                       nid = nidlist;
-                       for (i = 0; i < entry->mne_nid_count; i++) {
-                               lnet_nid4_to_nid(entry->u.nids[i], nid);
-                               nid += sizeof(struct lnet_nid);
-                       }
+                       for (i = 0; i < entry->mne_nid_count; i++)
+                               lnet_nid4_to_nid(entry->u.nids[i], &nidlist[i]);
                } else {
                        /* Handle the case if struct lnet_nid is expanded in
                         * the future. The MGS should prevent this but just
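diff --git a/lustre/tests/sanity-sec.sh b/lustre/tests/sanity-sec.sh
index d5bde69..ef841cb 100755
--- a/lustre/tests/sanity-sec.sh
+++ b/lustre/tests/sanity-sec.sh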
@@ -2592,6 +2592,8 @@ test_31() {
 
        export LNETCTL=$(which lnetctl 2> /dev/null)
 
+       (( $MDS1_VERSION >= $(version_code 2.15.0) )) ||
+               skip "Need MDS >= 2.15.0"
        [ -z "$LNETCTL" ] && skip "without lnetctl support." && return
        local_mode && skip "in local mode."