LU-14031 ptlrpc: decrease time between reconnection
author Alexander Boyko <c17825@cray.com>
Wed, 14 Oct 2020 08:20:58 +0000 (04:20 -0400)
committer Andreas Dilger <adilger@whamcloud.com>
Thu, 15 Jul 2021 20:43:36 +0000 (20:43 +0000)
commit 0316f2ce528c3c0db1c9a52ec836dc69c8ff5c55
tree 75a28b0f7e889b3fc39a24982ce6f7cc8ccfa2d0
parent 36bc8abc322f4f04c34f11f960c5f79b279c763c

When a connection request times out or gets an error reply from a
server, the next attempt happens after PING_INTERVAL, which is equal
to obd_timeout/4. When the first reconnection fails, the second goes
to the failover pair, and the third goes back to the original server.
That leaves only 3 reconnection attempts before the server evicts the
client based on the blocking AST timeout. Sometimes the first attempt
fails and the last one is a bit late, so the client is evicted. It is
better to retry with a timeout equal to the connection request
deadline, which would increase the number of attempts by a factor of
5 for a large obd_timeout. For example,
    obd_timeout=200
     - [ 1597902357, CONNECTING ]
     - [ 1597902357, FULL ]
     - [ 1597902422, DISCONN ]
     - [ 1597902422, CONNECTING ]
     - [ 1597902433, DISCONN ]
     - [ 1597902473, CONNECTING ]
     - [ 1597902473, DISCONN ] <- ENODEV from a failover pair
     - [ 1597902523, CONNECTING ]
     - [ 1597902539, DISCONN ]
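
The spacing in the log above can be checked against obd_timeout/4 with
a small sketch. The helper names below are hypothetical and are not
part of the patch; only the timestamps and obd_timeout=200 come from
the log.

```c
#include <assert.h>
#include <stddef.h>

/* Retry interval described above: PING_INTERVAL = obd_timeout / 4 (seconds). */
static long ping_interval(long obd_timeout)
{
	assert(obd_timeout > 0);
	return obd_timeout / 4;
}

/* CONNECTING timestamps of the three reconnect attempts, copied from the log. */
static const long connecting_ts[] = { 1597902422, 1597902473, 1597902523 };

/* Gap in seconds between reconnect attempt i and attempt i + 1. */
static long reconnect_gap(size_t i)
{
	assert(i + 1 < sizeof(connecting_ts) / sizeof(connecting_ts[0]));
	return connecting_ts[i + 1] - connecting_ts[i];
}
```

With obd_timeout=200 the interval is 50 seconds, and the two gaps
between reconnect attempts in the log are 51 and 50 seconds, so the
attempts are paced by PING_INTERVAL rather than by how quickly each
connection request failed.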

The patch adds logic to wake up the pinger when a connection request
fails with ETIMEDOUT or ENODEV. It also takes imp_next_ping into
account in the time_to_next_wake calculation in ptlrpc_pinger_main(),
and fixes how the imp_next_ping value is set.
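
As a rough illustration of the wake-time idea only, and not the code
in pinger.c, the sketch below shortens the default sleep when some
import has an earlier next-ping time. The struct, the function
signature, the "0 means unset" convention, and the 1 second floor are
all assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for an import; only the field discussed above. */
struct sketch_import {
	long next_ping;	/* absolute time of the next ping, 0 if unset (assumption) */
};

/*
 * Seconds until the pinger should wake: the default ping interval,
 * shortened by the earliest pending next_ping among the imports.
 * The result is never below 1 second (assumption for this sketch).
 */
static long time_to_next_wake(const struct sketch_import *imps, size_t n,
			      long now, long ping_interval)
{
	long wake = now + ping_interval;
	long delta;
	size_t i;

	for (i = 0; i < n; i++) {
		if (imps[i].next_ping != 0 && imps[i].next_ping < wake)
			wake = imps[i].next_ping;
	}

	delta = wake - now;
	return delta > 1 ? delta : 1;
}
```

In this sketch, an import whose next_ping is 10 seconds away makes the
pinger wake in 10 seconds instead of waiting for the full interval,
which is the effect the description above is after for failed
connection requests.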

Lustre-commit: de8ed5f19f04136a4addcb3f91496f26478d03e7
Lustre-change: https://review.whamcloud.com/40244

HPE-bug-id: LUS-8520
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia0891a8ead1922810037f7d71092cd57c061dab9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-on: https://review.whamcloud.com/44251
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ptlrpc/pinger.c