Whamcloud - gitweb
LU-16159 lod: cancel update llogs upon recovery abort
If recovery is aborted, cancel update catalog from catlist, and keep
them on disk for some time (for debug purpose), as can avoid
accumulating stale update records, and also avoid recovery problems
if update llogs are corrupt.
Update llogs are canceled after recovery completes and before regular
request processing. For these logs, their ctime will be set, and log
header will be marked with LLOG_F_MAX_AGE|LLOG_F_RM_ON_ERR, and when
30 days passed, they will be removed automatically.
Tidy up recovery abort code:
* if obd_abort_recovery is set, or OBD is stopping, stop both
client recovery and MDT recovery.
* otherwise if obd_abort_mdt_recovery is set, stop MDT recovery only.
lctl llog_print support printing update log FIDs used by specified
MDT:
* "lctl --device <MDT> llog_print update_log" will list all update
llog FIDs used by this MDT device.
Disabled replay-single.sh 100c stripe check because abort_recovery
will cancel update llogs, and won't replay them upon next recovery.
Added replay-single.sh 100d.
Formatall in the end of replay-single.sh because directory unlink may
fail.
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single,replay-single,replay-single,replay-single,replay-single,replay-single,replay-single,replay-single,replay-single
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie2bda6c097d65f5c51cba66c2dbf6ae4a5d36dda
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48584
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 files changed: