LU-1600 lnet: peer creation has race with shutdown
author Liang Zhen <liang@whamcloud.com>
Thu, 5 Jul 2012 05:10:08 +0000 (13:10 +0800)
committer Oleg Drokin <green@whamcloud.com>
Fri, 6 Jul 2012 16:29:21 +0000 (12:29 -0400)
commit deb96cf039d54f37dda2520b52a5e51801160eb4
tree 75f74af7d8363156687d8e0df01cd2608fb707d7
parent fa916a174d06c0d73147c62244e754ac4a93376f

lnet_nid2peer_locked()->lnet_find_peer_locked() returns NULL while
LNet is shutting down, so lnet_nid2peer_locked() then tries to
allocate a new peer and insert it into the peer table. If one thread
is doing this while another thread has already finalized everything
in LNet, the first thread will crash the system after the allocation.

The solution is to take an extra refcount on the peer table (its
number of peers) before allocating a new peer, because the shutdown
thread always waits for the number of peers to drop to zero before
going on to the next step of finalization.
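
The pattern described above can be sketched in user-space C. This is a
minimal illustration, not the patch applied to lnet/lnet/peer.c: the
names struct peer_table, npeers, shutting_down, peer_create(),
peer_destroy() and peer_table_try_finalize() are hypothetical, and a
pthread mutex stands in for whatever lock protects the real peer table.
The re-check of the shutdown flag after the allocation is also an
assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the structures in lnet/lnet/peer.c. */
struct peer {
	int nid;
};

struct peer_table {
	pthread_mutex_t lock;
	int npeers;        /* live peers plus creations still in flight */
	int shutting_down; /* set once shutdown has started */
};

/* Create a peer; returns NULL on shutdown or allocation failure. */
static struct peer *peer_create(struct peer_table *pt, int nid)
{
	struct peer *p;

	pthread_mutex_lock(&pt->lock);
	if (pt->shutting_down) {
		pthread_mutex_unlock(&pt->lock);
		return NULL;
	}
	/* Take the extra count BEFORE dropping the lock to allocate, so
	 * the shutdown path cannot see zero peers while we are in flight. */
	pt->npeers++;
	pthread_mutex_unlock(&pt->lock);

	p = calloc(1, sizeof(*p)); /* allocation happens without the lock */

	pthread_mutex_lock(&pt->lock);
	if (p == NULL || pt->shutting_down) {
		pt->npeers--; /* give the count back on failure */
		pthread_mutex_unlock(&pt->lock);
		free(p);
		return NULL;
	}
	p->nid = nid; /* on success the count now belongs to the peer */
	pthread_mutex_unlock(&pt->lock);
	return p;
}

/* Destroy a peer and drop the count it holds on the table. */
static void peer_destroy(struct peer_table *pt, struct peer *p)
{
	pthread_mutex_lock(&pt->lock);
	assert(pt->npeers > 0);
	pt->npeers--;
	pthread_mutex_unlock(&pt->lock);
	free(p);
}

/* Start shutdown; returns 1 only when no peer or creation remains. */
static int peer_table_try_finalize(struct peer_table *pt)
{
	int done;

	pthread_mutex_lock(&pt->lock);
	pt->shutting_down = 1;
	done = (pt->npeers == 0);
	pthread_mutex_unlock(&pt->lock);
	return done;
}
```

In this sketch the shutdown path would call peer_table_try_finalize()
repeatedly until it returns 1, and only then free the table. A creation
that is still allocating keeps npeers above zero, so that point cannot
be reached underneath it.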

This bug was not introduced by the new LNet, but the new LNet
exposes it easily.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5c8d26f08ce56092bee1b4bae5111fdfe1e9c12b
Reviewed-on: http://review.whamcloud.com/3274
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/lnet/peer.c