LU-17142 mgc: reconnection without pinger
When the MGS has been offline for some time, the adaptive timeout
(AT) grows and the connection request deadline becomes long.
Reconnection through the pinger waits a full request deadline before
the next attempt. The situation is worse with a failover partner,
where different connections are used.
Reconnection can also fail with a local MGS.
Here is the error seen when the MGC could not connect to a local MGS
(MDT combined with MGS):
LustreError: 15c-8: MGC90@kfi:
Confguration from log kjlmo12-MDT0000 failed from MGS -5.
The patch forces reconnection by invalidating the import and
aborting in-flight requests.
ptlrpc_recover_import() aborts its wait on the disconnected import
state. However, a disconnect occurs between connection attempts and
is valid, so the wait should not end there. This is fixed.
Reset the adaptive timeout when a local MGS starts. This allows the
MGC to reconnect efficiently.
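As a rough illustration of why the reset helps, the toy model below
keeps a timeout estimate that only grows with observed service times
and falls back to a floor after a reset. Every name here (toy_at,
toy_at_measure, toy_at_deadline, toy_at_reset) is hypothetical, and
the model is a simplification for illustration, not the Lustre AT
code.

```c
/* Toy model of an adaptive timeout (AT) estimate.
 * Hypothetical names; this is not Lustre's implementation. */
struct toy_at {
	unsigned int min;     /* floor for the estimate, in seconds */
	unsigned int current; /* worst service time seen so far */
};

/* Record an observed service time; in this model the estimate
 * only grows. */
static void toy_at_measure(struct toy_at *at, unsigned int seconds)
{
	if (seconds > at->current)
		at->current = seconds;
}

/* Deadline that the next connect request would use. */
static unsigned int toy_at_deadline(const struct toy_at *at)
{
	return at->current > at->min ? at->current : at->min;
}

/* Forget the history, as assumed to happen when a local MGS starts. */
static void toy_at_reset(struct toy_at *at)
{
	at->current = 0;
}
```

In this model, a single slow reply while the MGS was away inflates
the deadline for every later attempt until toy_at_reset() is called,
after which the deadline returns to the floor.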
mgs_barrier_gl_interpret_reply() should handle EINVAL from a client;
it means the client doesn't have a lock.
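One possible shape for such handling is sketched below. The helper
toy_barrier_reply_status() is hypothetical and is not the function
changed by the patch. The sketch assumes that "handle" means treating
-EINVAL as a non-error reply, which the description above does not
spell out.

```c
#include <errno.h>

/* Hypothetical helper, not the Lustre function: maps the status of
 * a client's glimpse reply to the status the caller should act on. */
static int toy_barrier_reply_status(int rc)
{
	/* Assumption: -EINVAL means the client holds no lock, so
	 * there is nothing to treat as a failure. */
	if (rc == -EINVAL)
		return 0;

	/* Any other status is passed through unchanged. */
	return rc;
}
```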
HPE-bug-id: LUS-11633
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ie631e04fb3e72900af076cf7f268f20f7b285445
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>