Whamcloud - gitweb
LU-16692 osp: do not assert on seq got over network 20/54020/15
author Li Dongyang <dongyangli@ddn.com>
Tue, 13 Feb 2024 04:10:53 +0000 (15:10 +1100)
committer Oleg Drokin <green@whamcloud.com>
Mon, 8 Apr 2024 15:34:58 +0000 (15:34 +0000)
commit f00d2467fc7c5ebd8a313683e039bf945a4b7094
tree ea29bff29edbfdb3501ccea906c1fada58168db9
parent bb6a2d2e80f04645b488ecca6ba14cb628e3eeb3
LU-16692 osp: do not assert on seq got over network

Replay requests carry FIDs that were already assigned,
and their sequence can differ from the osp's current one:
either a seq rollover happened after the original request
and something then triggered replay, or the osp lost the
seq rollover record on storage.

Detect this and avoid the assert in osp_fid_diff().
We don't update the last id on the osp in this case,
otherwise orphan cleanup could clean up the objects
in the osp's current sequence.
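
The check described above can be sketched roughly as follows.
This is a simplified, self-contained illustration and not the
code from this patch: struct sketch_fid, sketch_fid_diff() and
sketch_update_last_fid() are hypothetical names, and the real
struct lu_fid and osp_fid_diff() differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified FID; not the real struct lu_fid. */
struct sketch_fid {
	uint64_t seq;	/* sequence number */
	uint32_t oid;	/* object id within the sequence */
};

/*
 * Store fid1.oid - fid2.oid in *diff and return true only when
 * both FIDs share a sequence. On a mismatch, return false and
 * leave *diff untouched instead of asserting.
 */
static bool sketch_fid_diff(const struct sketch_fid *fid1,
			    const struct sketch_fid *fid2,
			    int64_t *diff)
{
	if (fid1->seq != fid2->seq)
		return false;
	*diff = (int64_t)fid1->oid - (int64_t)fid2->oid;
	return true;
}

/*
 * Advance the tracked last-used FID only for a FID from the
 * current sequence that is ahead of it. Returns true when
 * *last was updated.
 */
static bool sketch_update_last_fid(struct sketch_fid *last,
				   const struct sketch_fid *fid)
{
	int64_t diff;

	if (!sketch_fid_diff(fid, last, &diff))
		return false;	/* replayed FID from another sequence */
	if (diff <= 0)
		return false;	/* not newer than what is recorded */
	*last = *fid;
	return true;
}
```

In this sketch a replayed FID from a different sequence leaves
the recorded last id alone, so a later cleanup pass keyed on
that id would not touch objects in the current sequence.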

Also, when seq rollover happens in the osp, do not
LASSERT() if we didn't get a new seq; most likely the
previous seq update was lost on storage on the ofd/ost.
Instead, we can return the error code and let the
precreate thread try again.
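
A rough sketch of the "return an error and retry" pattern,
under stated assumptions: every name below is hypothetical,
-EAGAIN is an arbitrary choice for this example, and reading
"didn't get a new seq" as "the allocator handed back the
sequence already in use" is an interpretation, not something
the patch description spells out.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical callback standing in for the request for a new seq. */
typedef int (*sketch_alloc_seq_t)(void *ctx, uint64_t *new_seq);

/*
 * Roll over to a new sequence. If the allocator fails or returns
 * the sequence already in use, report a negative errno so the
 * caller can retry, rather than asserting.
 */
static int sketch_seq_rollover(uint64_t *cur_seq,
			       sketch_alloc_seq_t alloc, void *ctx)
{
	uint64_t new_seq = 0;
	int rc = alloc(ctx, &new_seq);

	if (rc < 0)
		return rc;
	if (new_seq == *cur_seq)
		return -EAGAIN;	/* assumed error code for "stale seq" */
	*cur_seq = new_seq;
	return 0;
}

/* Retry the rollover up to max_tries times; return the last result. */
static int sketch_rollover_retry(uint64_t *cur_seq,
				 sketch_alloc_seq_t alloc, void *ctx,
				 int max_tries)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_tries && rc != 0; i++)
		rc = sketch_seq_rollover(cur_seq, alloc, ctx);
	return rc;
}

/* Stub allocator for the example: hands out seqs[] in order. */
struct sketch_stub {
	const uint64_t *seqs;
	int pos;
};

static int sketch_stub_alloc(void *ctx, uint64_t *new_seq)
{
	struct sketch_stub *s = ctx;

	*new_seq = s->seqs[s->pos++];
	return 0;
}
```

With a stub that returns the stale sequence twice before a fresh
one, a single rollover attempt fails with the error code and the
retry loop succeeds once the new sequence arrives.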

Clean up lu_fid_diff(), which is not used.
In osp_create(), do not call osp_update_last_fid()
again for the regular non-replay case; it's already
done via osp_object_assign_fid()->osp_precreate_get_fid().

Change-Id: I509c00b998933d45865c9540e12a2db7d1b2b8ed
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/lustre_fid.h
lustre/osp/osp_internal.h
lustre/osp/osp_object.c
lustre/osp/osp_precreate.c
lustre/tests/recovery-small.sh
lustre/tests/sanity-pfl.sh