LU-12402 lnet: handle recursion in resend 67/38367/2
author Amir Shehata <ashehata@whamcloud.com>
Sat, 25 Apr 2020 17:19:32 +0000 (10:19 -0700)
committer Oleg Drokin <green@whamcloud.com>
Fri, 1 May 2020 04:33:42 +0000 (04:33 +0000)
commit 41ed1c18082435624dc5a391511a5ff40ec79979
tree 6d3a793043b9c9103bd3f4d71cc8d5e3ad16d7f4
parent bb588c70baa84b0e02d4e663d8bceca3cd39ee2b
LU-12402 lnet: handle recursion in resend

When we're resending a message we have to decommit it first. Decommitting
can cause another message to be picked up from the queue and sent; that
send can fail immediately and be finalized, causing recursion. This
problem was observed when a router was being shut down.

This patch uses the same mechanism used in lnet_finalize() to limit
recursion. If a thread is already finalizing a message and it gets
into a path where it starts finalizing a second, then that second
message is queued and handled later.
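
The queue-instead-of-recurse idea described above can be sketched in
isolation as follows. This is a minimal single-threaded illustration, not
the code in lib-msg.c: the names struct msg, struct finalizer, process()
and finalize() are hypothetical, and the real code has to track which
thread is finalizing and take the appropriate locks, which this sketch
omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message type; not the LNet struct lnet_msg. */
struct msg {
    int id;
    struct msg *triggers;  /* message whose finalization this one provokes */
    struct msg *next;      /* link in the pending queue */
    bool done;
};

/* Hypothetical per-thread state guarding against recursion. */
struct finalizer {
    bool busy;             /* a finalize() call is already on the stack */
    struct msg *head;      /* pending queue, FIFO */
    struct msg *tail;
    int depth;             /* current nesting of process() */
    int max_depth;         /* deepest nesting seen, for inspection */
};

void finalize(struct finalizer *f, struct msg *m);

/* Do the work for one message; it may provoke finalizing another one. */
void process(struct finalizer *f, struct msg *m)
{
    f->depth++;
    if (f->depth > f->max_depth)
        f->max_depth = f->depth;
    m->done = true;
    if (m->triggers != NULL)
        finalize(f, m->triggers);  /* queued, not processed, because busy */
    f->depth--;
}

void finalize(struct finalizer *f, struct msg *m)
{
    assert(m != NULL);

    /* Always append to the pending queue first. */
    m->next = NULL;
    if (f->tail != NULL)
        f->tail->next = m;
    else
        f->head = m;
    f->tail = m;

    /* Already finalizing: the call further up the stack drains the queue. */
    if (f->busy)
        return;

    f->busy = true;
    while ((m = f->head) != NULL) {
        f->head = m->next;
        if (f->head == NULL)
            f->tail = NULL;
        m->next = NULL;
        process(f, m);
    }
    f->busy = false;
}
```

In this sketch a chain of messages that each provoke the next is handled
by the outermost finalize() loop one at a time, so process() never nests
more than one level deep regardless of chain length.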

Lustre-change: https://review.whamcloud.com/35431
Lustre-commit: ad9243693c9a5a5b2c34165ad853ddf5ceec4617

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iace64c7ddb1f56a0a63b030df6a5ab103ae6c645
Reviewed-on: https://review.whamcloud.com/38367
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/include/lnet/lib-types.h
lnet/lnet/lib-msg.c