LU-16692 osp: do not assert on seq got over network
author Li Dongyang <dongyangli@ddn.com>
Tue, 13 Feb 2024 04:10:53 +0000 (15:10 +1100)
committer Andreas Dilger <adilger@whamcloud.com>
Mon, 15 Apr 2024 09:54:36 +0000 (09:54 +0000)
commit 55a9dfb82d747e58f6dac89da94dd2ceaad78cc3
tree fa81b136b0282681d0c1e1ea513d23bec46cb596
parent 483d6f2d5298e4b8a5ff9fb507947ff88ec7b21f

Replay requests have FIDs already assigned, and their
sequence could differ from the osp's current one:
either seq rollover happened after the original request
and something then triggered replay, or osp lost the
seq rollover record on storage.

Detect this and avoid the assert in osp_fid_diff().
We don't update the last id on osp in this case;
otherwise orphan cleanup could clean up the objects
in the current osp's sequence.
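
As a rough illustration of the check described above, the sketch below
compares two FIDs only when their sequences match and leaves the last id
untouched otherwise. It is not the code from this patch: struct sketch_fid,
sketch_fid_diff() and sketch_update_last_fid() are hypothetical names, and
the FID layout is simplified to a sequence plus an object id.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a FID; NOT the real struct lu_fid. */
struct sketch_fid {
	uint64_t seq;	/* sequence number */
	uint32_t oid;	/* object id within the sequence */
};

/*
 * Hypothetical helper: compute the oid distance between two FIDs.
 * Returns false instead of asserting when the sequences differ,
 * because a FID that arrived over the network is not trusted to
 * match the local sequence.
 */
static bool sketch_fid_diff(const struct sketch_fid *a,
			    const struct sketch_fid *b, int64_t *diff)
{
	if (a->seq != b->seq)
		return false;

	*diff = (int64_t)a->oid - (int64_t)b->oid;
	return true;
}

/*
 * Hypothetical helper: advance the last used FID only when the
 * replayed FID is in the same sequence and ahead of it.
 * Returns true if *last was updated.
 */
static bool sketch_update_last_fid(struct sketch_fid *last,
				   const struct sketch_fid *replayed)
{
	int64_t diff;

	/* Different sequence: keep the last id as it is. */
	if (!sketch_fid_diff(replayed, last, &diff))
		return false;

	/* Same sequence but not newer: nothing to do. */
	if (diff <= 0)
		return false;

	*last = *replayed;
	return true;
}
```

With this shape, a replayed FID from another sequence is simply skipped,
so the recorded last id keeps describing the current sequence.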

Also, when seq rollover happens in osp, do not
LASSERT() if we didn't get a new seq; most likely the
previous seq update on ofd/ost was lost on storage.
Instead, return the error code and let the precreate
thread try again.
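
The error-return approach above might look roughly like the following
sketch. All names here (sketch_seq_state, sketch_alloc_seq_fn,
sketch_seq_rollover(), sketch_alloc_stored()) are hypothetical, the reading
of "didn't get a new seq" as "the allocator returned the current seq again"
is an assumption, and -EAGAIN is an arbitrary choice since the commit
message does not name the error code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-device state: the sequence currently in use. */
struct sketch_seq_state {
	uint64_t cur_seq;
};

/* Hypothetical allocator callback: returns 0 and fills *new_seq. */
typedef int (*sketch_alloc_seq_fn)(void *arg, uint64_t *new_seq);

/*
 * Roll over to a new sequence. If the allocator fails, or hands
 * back the sequence already in use (assumed to mean the server lost
 * its previous seq update), return an error instead of asserting so
 * the caller can retry later.
 */
static int sketch_seq_rollover(struct sketch_seq_state *st,
			       sketch_alloc_seq_fn alloc, void *arg)
{
	uint64_t new_seq;
	int rc;

	rc = alloc(arg, &new_seq);
	if (rc != 0)
		return rc;

	/* Not actually a new sequence: report it, do not crash. */
	if (new_seq == st->cur_seq)
		return -EAGAIN;

	st->cur_seq = new_seq;
	return 0;
}

/* Toy allocator for demonstration: hands out the value in *arg. */
static int sketch_alloc_stored(void *arg, uint64_t *new_seq)
{
	*new_seq = *(uint64_t *)arg;
	return 0;
}
```

A caller playing the role of the precreate thread would treat a nonzero
return as "try again later" and leave its current sequence in place.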

Clean up lu_fid_diff(), which is not used.
In osp_create(), do not call osp_update_last_fid()
again for the regular non-replay case; it's already
done via osp_object_assign_fid()->osp_precreate_get_fid().

Lustre-change: https://review.whamcloud.com/54020
Lustre-commit: f00d2467fc7c5ebd8a313683e039bf945a4b7094

Change-Id: I509c00b998933d45865c9540e12a2db7d1b2b8ed
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54704
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/include/lustre_fid.h
lustre/osp/osp_internal.h
lustre/osp/osp_object.c
lustre/osp/osp_precreate.c
lustre/tests/recovery-small.sh
lustre/tests/sanity-pfl.sh