LU-12956 ldlm: fix hrtimer usage
A race can occur between hrtimer_start() and
hrtimer_expires_remaining(), because the latter does not hold a lock
on timer->base, while the former can change timer->base when the timer
is moved to a different CPU.
The following failure happened:
BUG: unable to handle kernel NULL pointer dereference at 000000000028
IP: [<ffffffffc0fc773f>] target_handle_connect+0x12ff/0x2b50 [ptlrpc]
The crash occurred at remaining = hrtimer_expires_remaining(timer),
where timer->base was NULL.
The fix replaces hrtimer_expires_remaining() with hrtimer_get_remaining(),
which takes the timer base lock and so prevents the race.
Fixes: 9334f1d51249 ("LU-11771 ldlm: use hrtimer for recovery to fix timeout messages")
HPE-bug-id: LUS-9514
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I2cea1e5e2d523f131f1acb3346cf0324adae624e
Reviewed-on: https://review.whamcloud.com/40513
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>