LU-11117 ptlrpc: don't zero request handle 81/32781/4
author Alexander Boyko <c17825@cray.com>
Fri, 15 Jun 2018 09:02:36 +0000 (05:02 -0400)
committer Oleg Drokin <green@whamcloud.com>
Wed, 18 Jul 2018 05:58:59 +0000 (05:58 +0000)
LNet can retransmit a request at any time if it has not been replied to.
ptlrpc_resend_req zeroes the request handle and ptlrpc_send_rpc
sets it again. If a retransmission happens while the handle is zeroed,
the client can't find a valid export by handle, sets rq_export to NULL,
and replies with ENOTCONN. The server evicts the client on this error:

client (nid x.x.x.x@tcp) returned error from blocking AST
(req status -107 rc -107), evict it

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-6037
Change-Id: I198666d386fea99b46994f965c1519acb5743d75
Reviewed-on: https://review.whamcloud.com/32781
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/ptlrpc/client.c

index d0e8d78..047eea7 100644
@@ -2785,7 +2785,7 @@ void ptlrpc_cleanup_client(struct obd_import *imp)
  */
 void ptlrpc_resend_req(struct ptlrpc_request *req)
 {
-        DEBUG_REQ(D_HA, req, "going to resend");
+       DEBUG_REQ(D_HA, req, "going to resend");
        spin_lock(&req->rq_lock);
 
        /* Request got reply but linked to the import list still.
@@ -2796,14 +2796,13 @@ void ptlrpc_resend_req(struct ptlrpc_request *req)
                return;
        }
 
-        lustre_msg_set_handle(req->rq_reqmsg, &(struct lustre_handle){ 0 });
-        req->rq_status = -EAGAIN;
+       req->rq_status = -EAGAIN;
 
-        req->rq_resend = 1;
-        req->rq_net_err = 0;
-        req->rq_timedout = 0;
+       req->rq_resend = 1;
+       req->rq_net_err = 0;
+       req->rq_timedout = 0;
 
-        ptlrpc_client_wake_req(req);
+       ptlrpc_client_wake_req(req);
        spin_unlock(&req->rq_lock);
 }
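
The failure described in the commit message can be illustrated with a small standalone model. This is only a sketch, not Lustre code: every `toy_*` type and function below is a hypothetical stand-in, and the lookup is assumed to treat a zero cookie as "no export", which mirrors the behaviour the message describes (no export found, reply with ENOTCONN).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT real Lustre types. */
struct toy_handle  { uint64_t cookie; };
struct toy_export  { struct toy_handle handle; };
struct toy_request { struct toy_handle handle; int status; };

/* Find an export by handle cookie. A zero cookie never matches
 * (assumption made for this model). */
static struct toy_export *toy_find_export(struct toy_export *tbl, size_t n,
                                          const struct toy_handle *h)
{
    if (h->cookie == 0)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (tbl[i].handle.cookie == h->cookie)
            return &tbl[i];
    return NULL;
}

/* Receiving side: a request without a matching export gets -ENOTCONN. */
static int toy_handle_incoming(struct toy_export *tbl, size_t n,
                               const struct toy_request *req)
{
    return toy_find_export(tbl, n, &req->handle) ? 0 : -ENOTCONN;
}

/* Resend preparation modelled on the code before the patch: the handle
 * is zeroed, so a retransmission sent before it is set again carries
 * cookie 0. */
static void toy_resend_old(struct toy_request *req)
{
    assert(req != NULL);
    req->handle.cookie = 0;
    req->status = -EAGAIN;
}

/* Resend preparation modelled on the code after the patch: the handle
 * is left untouched. */
static void toy_resend_new(struct toy_request *req)
{
    assert(req != NULL);
    req->status = -EAGAIN;
}
```

In this model, a request prepared with `toy_resend_old` and delivered before its handle is restored is answered with `-ENOTCONN`, while one prepared with `toy_resend_new` still resolves to its export.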