LU-12339 lnet: select LO interface for sending
author Amir Shehata <ashehata@whamcloud.com>
Sat, 25 May 2019 16:55:47 +0000 (09:55 -0700)
committer Oleg Drokin <green@whamcloud.com>
Fri, 4 Oct 2019 20:31:07 +0000 (20:31 +0000)
commit bab7da820e36d3c00e888704fc2c8d6022786c42
tree ad21354e052db9e658fbdaee6f17536506b41cd8
parent 73c8ae59cb2bd8352301d8f09ef1309adb5c8202
LU-12339 lnet: select LO interface for sending

In the following scenario:

Lustre->LNetPrimaryNID with 0@lo
Discover is initiated on 0@lo
The peer is created with 0@lo and <addr>@<net>
The interface health of the peer's <addr>@<net> is decremented
LNetPut() to self
The selection algorithm selects 0@lo to send to
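
The selection step in the scenario above can be illustrated with a toy
model. This is only a sketch, assuming that selection prefers the
candidate with the highest health value; the struct, the function name
and the health numbers are hypothetical and are not taken from
lib-move.c.

```c
#include <stddef.h>

/* Hypothetical candidate interface: a NID string and a health value. */
struct toy_candidate {
	const char *nid;
	int health;
};

/*
 * Return the candidate with the highest health, or NULL if n == 0.
 * On a tie the earlier entry wins. This is an assumed, simplified
 * policy, not the real LNet selection algorithm.
 */
static const struct toy_candidate *
toy_select_healthiest(const struct toy_candidate *c, size_t n)
{
	const struct toy_candidate *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (best == NULL || c[i].health > best->health)
			best = &c[i];
	}
	return best;
}
```

Under this assumption, once the health of <addr>@<net> drops below that
of 0@lo, the loopback entry is the one returned.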

This exposes an issue where the send goes through the peer credit
management algorithm, but because no credits are associated with
0@lo, the message is queued indefinitely. ptlrpc then gets stuck
waiting for send completion on the message.
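
The failure mode can be sketched with a toy credit model. This is an
illustration under stated assumptions, not the code in lib-move.c: the
struct, the enum and both functions are hypothetical names, and the
loopback short-circuit only shows the general idea of keeping 0@lo out
of credit accounting.

```c
#include <stdbool.h>

/* Hypothetical, simplified peer interface; not struct lnet_peer_ni. */
struct toy_peer_ni {
	bool is_loopback;	/* true for 0@lo */
	int tx_credits;		/* credits available for sending */
	int queued;		/* messages waiting for a credit */
};

enum toy_send_result { TOY_SENT, TOY_QUEUED, TOY_SENT_LOOPBACK };

/*
 * Credit-managed send: with no credit available the message is queued
 * until a credit is returned. An interface that never has credits
 * (as described for 0@lo) therefore queues forever.
 */
static enum toy_send_result toy_send_with_credits(struct toy_peer_ni *ni)
{
	if (ni->tx_credits <= 0) {
		ni->queued++;
		return TOY_QUEUED;
	}
	ni->tx_credits--;
	return TOY_SENT;
}

/* Assumed remedy: handle loopback before any credit accounting. */
static enum toy_send_result toy_send(struct toy_peer_ni *ni)
{
	if (ni->is_loopback)
		return TOY_SENT_LOOPBACK;
	return toy_send_with_credits(ni);
}
```

In this model a loopback interface with zero credits is queued by
toy_send_with_credits() but delivered by toy_send(), which mirrors the
hang described above and one way to avoid it.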

This was exposed via conf-sanity test 32a.

Lustre-change: https://review.whamcloud.com/34957
Lustre-commit: 69d1535ebdac139c6b19db2bca5f65663fe88467

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I98e9d3428b594a0d041d27d8e8d8de7596825edc
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36040
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/lnet/lib-move.c