LU-7934 osp: fix tr->otr_next_id overflow 90/19190/3
author Alexander Boyko <alexander.boyko@seagate.com>
Tue, 29 Mar 2016 11:39:47 +0000 (14:39 +0300)
committer Oleg Drokin <oleg.drokin@intel.com>
Thu, 16 Jun 2016 22:15:09 +0000 (22:15 +0000)
commit b9e1bb635039c6d2d985754a9a029c9d5c20b569
tree d074daffefab1df83333fd2bfdab7f6247393958
parent aa40787eee1835e4f84a40caa96fb232354bd799
LU-7934 osp: fix tr->otr_next_id overflow

The tracker's next_id and current_id were u32 because that type was
sized for the maximum number of llog records. But llog stores records
cyclically, so over its lifetime it can hold an unlimited number of
records; only the number present at any one moment is bounded. The
u32 ids can therefore overflow easily if a server is not rebooted:
osp_sync_id_get()) snx11126-OST0045-osc-MDT0000: next 0,
last synced 4294967205
This fix changes the type of the *id fields from u32 to u64. Only the
low 32 bits of current_id are now stored in the llog record header id,
and the full u64 is restored from the record header later. This is
possible because an llog catalog can store at most 64768^2 records,
which is less than the u32 max value.
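
The store-low-part / restore-full-id idea can be sketched as below.
This is an illustrative sketch only, not the code from this patch:
id_low_part() and id_restore() are hypothetical names, and the sketch
assumes the caller has a 64-bit reference id that is not smaller than
the original id and is less than 2^32 ahead of it.

```c
#include <stdint.h>

/* 64768^2 = 4194893824, which fits below the u32 maximum (4294967295). */
_Static_assert(64768ULL * 64768ULL < UINT32_MAX,
	       "catalog capacity must fit in 32 bits");

/* Hypothetical helper: the part of a 64-bit id kept in a 32-bit field. */
static uint32_t id_low_part(uint64_t id)
{
	return (uint32_t)id;
}

/*
 * Hypothetical helper: rebuild a 64-bit id from its stored low 32 bits.
 * Assumes reference >= original id and reference - original id < 2^32.
 */
static uint64_t id_restore(uint64_t reference, uint32_t low)
{
	uint64_t id = reference;

	/* The reference has crossed a 2^32 boundary since 'low' was stored. */
	if ((uint32_t)id < low)
		id -= 0x100000000ULL;

	id &= ~0xffffffffULL;	/* keep the high 32 bits */
	id |= low;		/* splice in the stored low 32 bits */
	return id;
}
```

For example, with a reference of 2^32 + 10 and a stored low part of
4294967205, id_restore() yields 4294967205 rather than a value in the
next 2^32 window.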

The patch adds a test that checks for u32 overflow of the otr_next_id
field of osp_id_tracker.

The test recreates the following assertion:
LustreError: 185667:0:(osp_sync.c:1544:osp_sync_id_get())
snx11126-OST0045-osc-MDT0000: next 0, last synced 4294967205
LustreError: 231396:0:(osp_sync.c:1545:osp_sync_id_get()) LBUG
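
The "next 0, last synced 4294967205" values above are what a 32-bit
counter produces when it wraps. A minimal sketch with hypothetical
helper names follows; the exact check made in osp_sync_id_get() is not
shown in this message, so id_is_ahead() is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: advance a 32-bit id; arithmetic wraps modulo 2^32. */
static uint32_t advance32(uint32_t id, uint32_t n)
{
	return id + n;
}

/* Hypothetical: advance a 64-bit id; no wrap in any realistic uptime. */
static uint64_t advance64(uint64_t id, uint64_t n)
{
	return id + n;
}

/* Assumed sanity check: the next id must be ahead of the last synced id. */
static bool id_is_ahead(uint64_t next, uint64_t last_synced)
{
	return next > last_synced;
}
```

Starting from 4294967205, advancing by 91 gives 0 with advance32() and
4294967296 with advance64(), so only the 64-bit value still passes
id_is_ahead() against a last synced id of 4294967205.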

Seagate-bug-id: MRP-3367
Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Change-Id: I89d70ecb068f8d0b5a1e1ac35b85a4b6e53211e5
Reviewed-on: http://review.whamcloud.com/19190
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lustre/include/obd_support.h
lustre/osp/osp_dev.c
lustre/osp/osp_internal.h
lustre/osp/osp_sync.c
lustre/tests/replay-single.sh