LU-8420 ldlm: take at_current change into account on prolong 48/21448/9
author Vladimir Saveliev <vladimir.saveliev@seagate.com>
Fri, 2 Dec 2016 00:40:40 +0000 (02:40 +0200)
committer Oleg Drokin <oleg.drokin@intel.com>
Tue, 7 Feb 2017 06:12:24 +0000 (06:12 +0000)
commit 18c95c436a55a2c7c8b8f71c0935e8d92c70c42f
tree f8e015e79ecac67dfa44938241eee162267f4d9a
parent 8c5d21639ae3628a398ff5556ed90b58d41c456b
LU-8420 ldlm: take at_current change into account on prolong

The prolong timeout is calculated from the estimated service time. When
prolong is called after a bulk transfer timeout, the service estimate on
the server side may have been reset recently because more than at_history
has passed since the worst RPC time. If the RPC timeout was initially
based on a larger service estimate, the prolonged timeout may be smaller
than the original one, so the lock callback timer is not prolonged, which
may result in the client's eviction.

When trying to prolong the lock callback timer, take into account that the
worst server estimate might have been reset. In that case, calculate the
prolong timeout from the service estimate set by the client when sending
the RPC.
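
The idea can be sketched roughly as follows. This is a minimal
illustration, not the code from this patch: the function names, the
plain "seconds" arguments, the fixed margin, and the choice of taking
the larger of the two estimates are all assumptions made for the example.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a Lustre function): choose the service
 * estimate, in seconds, that the prolong timeout should be based on.
 *
 * at_current   - server-side service estimate at prolong time
 * rpc_estimate - estimate the client used when it sent the RPC
 */
static long prolong_service_estimate(long at_current, long rpc_estimate)
{
	assert(at_current >= 0 && rpc_estimate >= 0);

	/*
	 * If the server estimate was reset and is now below the value
	 * the client used, fall back to the client's value so that the
	 * prolonged timeout does not shrink.
	 */
	return at_current > rpc_estimate ? at_current : rpc_estimate;
}

/*
 * Hypothetical: prolong timeout is the chosen estimate plus a fixed
 * margin. The real conversion from estimate to timeout may differ.
 */
static long prolong_timeout(long at_current, long rpc_estimate, long margin)
{
	return prolong_service_estimate(at_current, rpc_estimate) + margin;
}
```

With a server estimate that was reset to 5 seconds and an RPC sent under
a 40-second estimate, this sketch bases the prolong timeout on 40
seconds rather than 5.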

A test that illustrates the issue is included.

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Seagate-bug-id: MRP-3582
Change-Id: I79988c8e82967d8eef077f42cd6331999294ea50
Reviewed-on: https://review.whamcloud.com/21448
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lustre/ofd/ofd_dev.c
lustre/tests/recovery-small.sh