author shaver <shaver>
Wed, 9 Oct 2002 21:15:22 +0000 (21:15 +0000)
committer shaver <shaver>
Wed, 9 Oct 2002 21:15:22 +0000 (21:15 +0000)
commit b9e8a2497349345adc1511422b7bed8d9f6ae111
tree 129526d1a62f77377c38a81a4e6d531dc012d923
parent ae52ffd71430afc35470526907ec321f47895523

So.  When we replay a request, we go through request_out_callback again,
which is called when portals informs us that our message has been sent.
That will decref the request again, and unless its refcount has been bumped for
each resend/replay, we will prematurely free it.  In addition to the
obvious evil of freeing it (which will take it off the sending_head
before we're really done with it), it also causes a deadlock when
free_req attempts to acquire req->rq_connection->c_lock -- which is
already held by the recovery replay loop!

This should make things better, and might even fix the MDS failover
test.
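
The refcount imbalance described above can be illustrated with a minimal
sketch.  This is not the code from lustre/ptlrpc/client.c or recovd.c; every
name here (sketch_request, sketch_send, sketch_out_callback, and so on) is
hypothetical, and the "sent" callback is delivered synchronously to keep the
example small.  It assumes only what the message states: the callback drops
one reference per transmission, so each send, resend, or replay has to take
one first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a ptlrpc request; not the real Lustre struct. */
struct sketch_request {
    int  refcount;  /* references currently held */
    bool freed;     /* set when the last reference is dropped */
};

static void sketch_req_init(struct sketch_request *req)
{
    req->refcount = 1;   /* the caller's own reference */
    req->freed = false;
}

static void sketch_req_get(struct sketch_request *req)
{
    assert(!req->freed);
    req->refcount++;
}

static void sketch_req_put(struct sketch_request *req)
{
    assert(req->refcount > 0);
    if (--req->refcount == 0)
        req->freed = true;  /* real code would unlink and free here */
}

/* Models the "message has been sent" callback: always drops one reference. */
static void sketch_out_callback(struct sketch_request *req)
{
    sketch_req_put(req);
}

/*
 * Models one transmission (initial send, resend, or replay).
 * take_ref selects whether a reference is taken for this transmission;
 * passing false reproduces the premature free.
 */
static void sketch_send(struct sketch_request *req, bool take_ref)
{
    if (take_ref)
        sketch_req_get(req);
    sketch_out_callback(req);  /* delivered synchronously for simplicity */
}
```

With a reference taken for every transmission, the caller's reference
survives a replay.  If the replay skips the bump, the callback drops the
last reference while the caller still expects to own the request.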
lustre/ptlrpc/client.c
lustre/ptlrpc/recovd.c