LU-6084 ptlrpc: prevent request timeout grow due to recovery 20/13520/9
author Mikhail Pershin <mike.pershin@intel.com>
Tue, 3 Feb 2015 18:30:14 +0000 (10:30 -0800)
committer Oleg Drokin <oleg.drokin@intel.com>
Mon, 9 Feb 2015 02:40:41 +0000 (02:40 +0000)
commit 84f813bf639a7d078e19a3cf41f7c06a3824caa9
tree 8e95ec04bd46f965e4235b49a91f54bd73a16b68
parent eaf686e31903057aef2ad855fb2133a285a9f08a

This patch fixes the growing request timeout that appeared after
commit 1d889090 for LU-5079. While that commit itself is correct,
it reveals another issue. If a request is processed for a long
time on the server, the client's adaptive timeouts adapt to that
after receiving the reply, and new requests get a bigger timeout.
Another problem is that the server AT history is corrupted by the
recovery request processing time, which is not pure service time
but also includes the time spent waiting for clients to recover.

The patch prevents AT stats updates from early replies on the client
and from the processing time of recovering requests on the server.
ptlrpc_at_recv_early_reply() still updates the current request's
timeout as asked by the server, but does not include it in the AT
stats. The real reply will bring that data from the server anyway.
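
The rule described above can be illustrated with a minimal sketch in C.
This is not the Lustre code: every name below (at_sketch, req_sketch,
client_early_reply() and so on) is hypothetical, and the AT history is
reduced to a single "worst time seen" value purely for illustration.
It only shows which events are allowed to feed the estimate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an adaptive-timeout
 * (AT) history: it only remembers the largest time recorded so far. */
struct at_sketch {
	unsigned int worst_seen;	/* seconds */
};

/* Hypothetical in-flight request with its own timeout. */
struct req_sketch {
	unsigned int timeout;		/* seconds */
};

/* Record one measurement in the history. */
void at_sketch_measured(struct at_sketch *at, unsigned int secs)
{
	assert(at != NULL);
	if (secs > at->worst_seen)
		at->worst_seen = secs;
}

/* Client side: an early reply extends the timeout of this request
 * only. The history is deliberately left untouched. */
void client_early_reply(struct at_sketch *at, struct req_sketch *req,
			unsigned int asked)
{
	assert(at != NULL && req != NULL);
	(void)at;	/* no AT stats update from early replies */
	req->timeout = asked;
}

/* Client side: the real reply carries the server's estimate, and only
 * this value is recorded in the history. */
void client_real_reply(struct at_sketch *at, unsigned int estimate)
{
	at_sketch_measured(at, estimate);
}

/* Server side: record the processing time unless the request was
 * handled during recovery, where the elapsed time also includes
 * waiting for other clients. */
void server_request_done(struct at_sketch *at, unsigned int elapsed,
			 bool in_recovery)
{
	if (!in_recovery)
		at_sketch_measured(at, elapsed);
}

/* Timeout a new request would get in this sketch. */
unsigned int new_request_timeout(const struct at_sketch *at)
{
	assert(at != NULL);
	return at->worst_seen;
}
```

In this sketch, calling client_early_reply() with a large value changes
only the timeout of that one request, and new_request_timeout() stays
the same until client_real_reply() records a value. Likewise,
server_request_done() called with in_recovery set leaves the server
history unchanged.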

Test-Parameters: alwaysuploadlogs testlist=replay-vbr,replay-dual

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ifcadfd669162013b6ccb386eb2b508bd9f0b22d9
Reviewed-on: http://review.whamcloud.com/13520
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lustre/ptlrpc/client.c
lustre/ptlrpc/service.c
lustre/tests/replay-dual.sh
lustre/tests/replay-vbr.sh