LU-13984 ptlrpc: throttle RPC resend if network error
When sending a callback AST to a non-responding client, the server
retries endlessly until the client is eventually evicted. With
ksocklnd, it retries after each AST timeout until the socket is
closed, after sock_timeout seconds. From then on, each retry fails
immediately with -110 (-ETIMEDOUT), as no socket can be established.
The thread then spins, retrying and failing, until the client is
finally evicted. This causes high CPU usage in that thread and
possible resource denial.
To work around this, this patch skips the callback resend if:
- the request is flagged with both network error and timeout
- the last try was less than 1 second ago
In the worst case, the retry happens after a timeout based on
req->rq_deadline. If there is nothing else to handle, the thread sleeps
during that time, removing the CPU overhead.
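
The throttling condition can be illustrated with a minimal standalone
sketch. This is not the code from the patch: struct sketch_req, its
fields, and sketch_should_throttle() are hypothetical stand-ins for the
request state described above, and the 1-second window is taken from the
description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the request state relevant to throttling. */
struct sketch_req {
	bool   net_err;   /* request flagged with a network error */
	bool   timedout;  /* request flagged as timed out */
	time_t last_sent; /* time of the last send attempt, in seconds */
};

/*
 * Return true if the resend should be skipped for now: the request
 * carries both flags and the last try was less than 1 second ago.
 */
static bool sketch_should_throttle(const struct sketch_req *req, time_t now)
{
	assert(req != NULL);
	return req->net_err && req->timedout &&
	       (now - req->last_sent) < 1;
}
```

A caller would skip the request when this returns true and leave it to
be picked up again later, rather than resending in a tight loop.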
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: Ie5028761c978b26e833fd0a5d30d313addf57984
Reviewed-on: https://review.whamcloud.com/40020
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>