LU-14031 ptlrpc: decrease time between reconnection 38/40638/4
author Alexander Boyko <c17825@cray.com>
Wed, 14 Oct 2020 08:20:58 +0000 (04:20 -0400)
committer Oleg Drokin <green@whamcloud.com>
Thu, 4 Mar 2021 08:36:59 +0000 (08:36 +0000)
commit 443655679a7976d6bb5a62b7b155ffa2e39b857a
tree 54b5ef2bc075a80e6d0b8f646f1915661d935995
parent 8411816d331ca9ad27215b74070e106f560b727d

When a connection request times out or gets an error reply from a
server, the next attempt happens after PING_INTERVAL, which is equal
to obd_timeout/4. When the first reconnection fails, the second goes
to the failover pair, and the third goes back to the original server.
That leaves only 3 reconnection attempts before the server evicts the
client based on the blocking AST timeout. Sometimes the first attempt
fails and the last one is a bit late, so the client is evicted. It is
better to retry after a timeout equal to the connection request
deadline, which would increase the number of attempts by a factor of
5 for a large obd_timeout. For example,
    obd_timeout=200
     - [ 1597902357, CONNECTING ]
     - [ 1597902357, FULL ]
     - [ 1597902422, DISCONN ]
     - [ 1597902422, CONNECTING ]
     - [ 1597902433, DISCONN ]
     - [ 1597902473, CONNECTING ]
     - [ 1597902473, DISCONN ] <- ENODEV from a failover pair
     - [ 1597902523, CONNECTING ]
     - [ 1597902539, DISCONN ]
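
The spacing in the trace above can be checked with a little arithmetic.
The sketch below is illustrative only: ping_interval() and
reconnect_gap() are hypothetical helpers, not Lustre functions, and the
timestamps are the three CONNECTING entries that follow the first
DISCONN in the trace.

```c
#include <assert.h>
#include <stddef.h>

/* PING_INTERVAL as described in this commit message: obd_timeout / 4.
 * Hypothetical helper, not the Lustre macro itself. */
static long ping_interval(long obd_timeout)
{
	assert(obd_timeout > 0);
	return obd_timeout / 4;
}

/* CONNECTING timestamps of the three reconnect attempts in the trace. */
static const long connecting_ts[] = { 1597902422, 1597902473, 1597902523 };

/* Gap in seconds between reconnect attempt i and attempt i + 1. */
static long reconnect_gap(size_t i)
{
	assert(i + 1 < sizeof(connecting_ts) / sizeof(connecting_ts[0]));
	return connecting_ts[i + 1] - connecting_ts[i];
}
```

With obd_timeout=200, ping_interval(200) is 50, and the gaps between
successive attempts are 51 and 50 seconds, so each retry waits roughly
one full PING_INTERVAL even though the failed requests themselves
returned after 11, 0 and 16 seconds.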

The patch adds logic to wake up the pinger when a connection request
fails with ETIMEDOUT or ENODEV. It also takes imp_next_ping into
account when ptlrpc_pinger_main() calculates time_to_next_wake, and
fixes how the imp_next_ping value is set.
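
As a rough model of a wake-up calculation that takes a per-import next
ping time into account, consider the sketch below. It is an assumption
about the general shape of such a calculation, not the code from this
patch: struct import_model and this standalone time_to_next_wake()
function are hypothetical, and the real pinger works on Lustre import
structures with its own time types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one import field this sketch needs. */
struct import_model {
	long next_ping;	/* absolute time (s) of next ping; 0 = unset */
};

/*
 * Seconds the pinger should sleep, starting at `now`.
 * Starts from the regular interval and shortens it if any import
 * wants attention earlier. Never returns less than 1 second.
 */
static long time_to_next_wake(long now, long interval,
			      const struct import_model *imps, size_t n)
{
	long wake = interval;
	size_t i;

	assert(interval > 0);
	for (i = 0; i < n; i++) {
		long delta;

		if (imps[i].next_ping == 0)
			continue;	/* nothing scheduled for this import */
		delta = imps[i].next_ping - now;
		if (delta < wake)
			wake = delta;
	}
	return wake < 1 ? 1 : wake;
}
```

In this model, an import whose next ping is due 10 seconds from now
shortens a 50 second sleep to 10 seconds, which is the kind of earlier
retry the patch is aiming for after a failed connection request.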

Lustre-commit: de8ed5f19f04136a4addcb3f91496f26478d03e7
Lustre-change: https://review.whamcloud.com/40244

HPE-bug-id: LUS-8520
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia0891a8ead1922810037f7d71092cd57c061dab9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/40638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ptlrpc/events.c
lustre/ptlrpc/import.c
lustre/ptlrpc/niobuf.c
lustre/ptlrpc/pinger.c