When we replay a request, we go through request_out_callback again,
which is called when portals informs us that our message has been sent.
That will decref the request again, and unless its refcount has been
bumped for each resend/replay, we will free it prematurely. In addition
to the obvious evil of freeing it (which will take it off the
sending_head before we're really done with it), it also causes a
deadlock when free_req attempts to acquire req->rq_connection->c_lock,
which is already held by the recovery replay loop!
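
The refcount imbalance described above can be illustrated with a minimal
sketch. This is not the actual ptlrpc code: struct sketch_req and the
sketch_* functions are hypothetical names invented for illustration, and
the model assumes only that every send-completion callback drops exactly
one reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request; not the real ptlrpc structure. */
struct sketch_req {
    int  refcount;  /* references currently held */
    bool freed;     /* set once the last reference is dropped */
};

/* Drop one reference; mark the request freed when the count hits zero. */
static void sketch_req_put(struct sketch_req *req)
{
    assert(req->refcount > 0);
    if (--req->refcount == 0)
        req->freed = true;
}

/* Hand the request to the network, optionally taking a reference first. */
static void sketch_send(struct sketch_req *req, bool bump)
{
    if (bump)
        req->refcount++;
}

/* Models the send-completion callback: it always drops one reference. */
static void sketch_out_callback(struct sketch_req *req)
{
    sketch_req_put(req);
}

/*
 * Send once, then replay `replays` times. Returns true if the request
 * was freed even though the caller never dropped its own reference.
 */
static bool sketch_freed_early(int replays, bool bump_on_replay)
{
    struct sketch_req req = { .refcount = 1, .freed = false };

    sketch_send(&req, true);        /* original send takes a reference */
    sketch_out_callback(&req);      /* completion drops it again */

    for (int i = 0; i < replays && !req.freed; i++) {
        sketch_send(&req, bump_on_replay);
        sketch_out_callback(&req);  /* drops a reference on every replay */
    }
    return req.freed;
}
```

In this model, a single replay without the extra reference frees the
request out from under its owner, while bumping the count on every replay
keeps it alive. The deadlock on c_lock is not modeled here.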
This should make things better, and might even fix the MDS failover
test.