LU-7434 ptlrpc: lost bulk leads to a hang 21/17221/10
author Vitaly Fertman <vitaly.fertman@seagate.com>
Tue, 1 Mar 2016 23:46:31 +0000 (02:46 +0300)
committer Oleg Drokin <oleg.drokin@intel.com>
Thu, 21 Apr 2016 02:27:54 +0000 (02:27 +0000)
commit 55f8520817a31dabf19fe0a8ac2492b85d039c38
tree 2df7f5b740360a53b8fea3df4af2cf1eccbbad60
parent 0a338970c2c73e14cc9be65d360de85be28ca488
LU-7434 ptlrpc: lost bulk leads to a hang

When request_out_callback() and reply_in_callback() run in reverse
order, the RPC enters the UNREGISTERING state, which waits for both
the RPC and the bulk MD unlink, although only the RPC MD unlink has
been called so far. If the bulk is lost, the request hangs, because
even expired_set does not check for the UNREGISTERING state.

The same happens for a write if the server returns an error.

The UNREGISTERING phase is ambiguous, so split it into UNREG_RPC and
UNREG_BULK.
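
To illustrate why a single unregistering phase is ambiguous, here is a
minimal, self-contained toy model. It is not the Lustre code: the
toy_* names, the struct layout and the helper functions are all
hypothetical, and only the UNREG_RPC / UNREG_BULK naming is taken from
this message. It assumes a request tracks separately whether its RPC
MD and its bulk MD are still linked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phases; names echo the commit message, values are arbitrary. */
enum toy_phase {
	TOY_PHASE_RPC,
	TOY_PHASE_BULK,
	TOY_PHASE_UNREG_RPC,	/* waiting for the RPC MD unlink only */
	TOY_PHASE_UNREG_BULK,	/* waiting for the bulk MD unlink only */
	TOY_PHASE_COMPLETE,
};

/* Hypothetical request state, not struct ptlrpc_request. */
struct toy_req {
	enum toy_phase phase;
	bool rpc_md_linked;	/* request/reply buffers still registered */
	bool bulk_md_linked;	/* bulk buffers still registered */
};

/* Combined phase: done only when BOTH unlinks have completed. */
static bool toy_unreg_done_combined(const struct toy_req *req)
{
	assert(req != NULL);
	return !req->rpc_md_linked && !req->bulk_md_linked;
}

/* Split phases: each phase waits only for the unlink it names. */
static bool toy_unreg_done(const struct toy_req *req)
{
	assert(req != NULL);
	switch (req->phase) {
	case TOY_PHASE_UNREG_RPC:
		return !req->rpc_md_linked;
	case TOY_PHASE_UNREG_BULK:
		return !req->bulk_md_linked;
	default:
		return true;	/* not in an unregistering phase */
	}
}
```

In this sketch, a request whose RPC MD is already unlinked but whose
bulk MD is still linked never satisfies the combined check, while the
split check lets it leave UNREG_RPC and makes the remaining bulk wait
explicit.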

Signed-off-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Change-Id: Ib1eeb1777ad1ab4c7ea1c83fe95dc9ae82c1894c
Seagate-bug-id:  MRP-2953, MRP-3206
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-by: Alexey Leonidovich Lyashkov <alexey.lyashkov@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: http://review.whamcloud.com/17221
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lustre/include/lustre_net.h
lustre/include/obd_support.h
lustre/ptlrpc/client.c
lustre/ptlrpc/import.c
lustre/ptlrpc/niobuf.c
lustre/target/tgt_handler.c
lustre/tests/conf-sanity.sh
lustre/tests/recovery-small.sh