LU-1600 lnet: peer creation has race with shutdown
lnet_nid2peer_locked()->lnet_find_peer_locked() returns NULL while
LNet is shutting down, so lnet_nid2peer_locked() then tries to allocate
a new peer and insert it into the peer table. If another thread has
already finalized all of LNet in the meantime, the first thread will
crash the system after the allocation.
The solution is to take an extra refcount on the peer table (its number
of peers) before allocating a new peer, because the shutdown thread
always has to wait until the number of peers drops to zero before going
on to the next step of finalization.
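The idea can be sketched in userspace C as follows. This is only an
illustration under stated assumptions, not the code of this patch:
struct peer, struct peer_table, table_init(), nid2peer() and
table_shutdown() are hypothetical stand-ins, and a pthread mutex and
condition variable replace LNet's own locking and waiting. The point it
shows is that the count is raised before the lock is dropped for the
allocation, so shutdown cannot see zero while an allocation is pending.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real LNet structures. */
struct peer {
	uint64_t     nid;
	struct peer *next;
};

struct peer_table {
	pthread_mutex_t lock;
	pthread_cond_t  drained;       /* signalled when count reaches zero */
	struct peer    *head;          /* singly linked list of peers */
	int             count;         /* peers in table + pending allocations */
	int             shutting_down;
};

void table_init(struct peer_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->drained, NULL);
	t->head = NULL;
	t->count = 0;
	t->shutting_down = 0;
}

/* Drop one reference on the table; caller holds t->lock. */
static void count_put_locked(struct peer_table *t)
{
	assert(t->count > 0);
	if (--t->count == 0)
		pthread_cond_broadcast(&t->drained);
}

static struct peer *find_locked(struct peer_table *t, uint64_t nid)
{
	for (struct peer *p = t->head; p != NULL; p = p->next)
		if (p->nid == nid)
			return p;
	return NULL;
}

/* Look up a peer, creating it if needed. Returns 0 or a negative errno. */
int nid2peer(struct peer_table *t, uint64_t nid, struct peer **out)
{
	struct peer *p, *other;
	int rc = 0;

	*out = NULL;
	pthread_mutex_lock(&t->lock);
	if (t->shutting_down) {
		rc = -ESHUTDOWN;
		goto unlock;
	}
	p = find_locked(t, nid);
	if (p != NULL)
		goto found;

	t->count++;                    /* reserve before dropping the lock */
	pthread_mutex_unlock(&t->lock);
	p = calloc(1, sizeof(*p));     /* shutdown may start while we are here */
	pthread_mutex_lock(&t->lock);

	other = NULL;
	if (p == NULL)
		rc = -ENOMEM;
	else if (t->shutting_down)
		rc = -ESHUTDOWN;
	else
		other = find_locked(t, nid);

	if (rc != 0 || other != NULL) {
		/* Failed, or another thread inserted first: undo reservation. */
		free(p);
		count_put_locked(t);
		p = other;
		if (rc != 0)
			goto unlock;
	} else {
		/* The reservation now accounts for the inserted peer. */
		p->nid = nid;
		p->next = t->head;
		t->head = p;
	}
found:
	*out = p;
unlock:
	pthread_mutex_unlock(&t->lock);
	return rc;
}

/* Free all peers, then wait for pending allocations to back out. */
void table_shutdown(struct peer_table *t)
{
	pthread_mutex_lock(&t->lock);
	t->shutting_down = 1;
	while (t->head != NULL) {
		struct peer *p = t->head;

		t->head = p->next;
		free(p);
		count_put_locked(t);
	}
	while (t->count > 0)
		pthread_cond_wait(&t->drained, &t->lock);
	pthread_mutex_unlock(&t->lock);
}
```

Without the early count++, a thread sleeping in the allocation would be
invisible to table_shutdown(), which could then return and let the
caller tear the table down before that thread retakes the lock.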
This bug was not introduced by the new LNet, but the new LNet exposes
it much more easily.
Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5c8d26f08ce56092bee1b4bae5111fdfe1e9c12b
Reviewed-on: http://review.whamcloud.com/3274
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>