LU-16271 ptlrpc: fix eviction right after recovery 57/49257/4
author: Alexander Boyko <alexander.boyko@hpe.com>
Mon, 28 Nov 2022 14:20:05 +0000 (09:20 -0500)
committer: Oleg Drokin <green@whamcloud.com>
Tue, 3 Jan 2023 21:32:47 +0000 (21:32 +0000)
When recovery finishes, exports can time out because the
recovery thread waits for stale clients and no more requests
arrive after the final ping. This was handled by updating the
export timers after final ping processing. LU-16002 introduced
fast evictions, which leads to an error: eviction right after
recovery. Update the export timers before obd_recovering is cleared.

Fixes: 6bdeda7afe ("LU-16002 ptlrpc: reduce pinger eviction time")
Test-Parameters: testlist=replay-single env=ONLY=89,ONLY_REPEAT=20
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ibf3b2f632d6d3aa1de57038fdecbec38cf9a97cf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49257
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ldlm/ldlm_lib.c

index 2a7aa27..335c942 100644
@@ -2825,6 +2825,15 @@ static int target_recovery_thread(void *arg)
                OBP(obd, iocontrol)(OBD_IOC_LLOG_CANCEL, obd->obd_self_export,
                                    0, NULL, NULL);
 
+       list_for_each_entry(req, &obd->obd_final_req_queue, rq_list) {
+               /*
+                * Because the waiting client can not send ping to server,
+                * so we need refresh the last_request_time, to avoid the
+                * export is being evicted
+                */
+               ptlrpc_update_export_timer(req->rq_export, 0);
+       }
+
        /*
         * We drop recoverying flag to forward all new requests
         * to regular mds_handle() since now
@@ -2843,12 +2852,6 @@ static int target_recovery_thread(void *arg)
                          libcfs_nidstr(&req->rq_peer.nid));
                handle_recovery_req(thread, req,
                                    trd->trd_recovery_handler);
-               /*
-                * Because the waiting client can not send ping to server,
-                * so we need refresh the last_request_time, to avoid the
-                * export is being evicted
-                */
-               ptlrpc_update_export_timer(req->rq_export, 0);
                target_request_copy_put(req);
        }
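
The ordering change above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Lustre code: the toy_* types and functions are hypothetical stand-ins, the timeout and timestamps are made-up values, and the model assumes (as the commit message implies) that evictions are held off while the recovering flag is set and that an eviction scan may run right after the flag is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an export and a target; not Lustre types. */
struct toy_export {
	long last_request_time;	/* seconds, arbitrary epoch */
	bool evicted;
};

struct toy_target {
	bool recovering;	/* models obd_recovering */
	long evict_timeout;	/* seconds of silence before eviction */
};

/* Assumed behaviour: the scan evicts nothing while recovery is running. */
static void toy_evictor_scan(const struct toy_target *t,
			     struct toy_export *exps, size_t n, long now)
{
	assert(t != NULL && exps != NULL);
	if (t->recovering)
		return;
	for (size_t i = 0; i < n; i++)
		if (now - exps[i].last_request_time > t->evict_timeout)
			exps[i].evicted = true;
}

/* Models refreshing the timer of every export with a queued final request. */
static void toy_refresh_timers(struct toy_export *exps, size_t n, long now)
{
	for (size_t i = 0; i < n; i++)
		exps[i].last_request_time = now;
}

/*
 * Replays the end of recovery and returns how many exports were evicted.
 * refresh_before_clear == true models the patched order,
 * refresh_before_clear == false models the order before the patch.
 */
static size_t toy_finish_recovery(bool refresh_before_clear)
{
	struct toy_target tgt = { .recovering = true, .evict_timeout = 30 };
	/* Both clients sent their final ping at t = 0, then went quiet. */
	struct toy_export exps[2] = {
		{ .last_request_time = 0, .evicted = false },
		{ .last_request_time = 0, .evicted = false },
	};
	const size_t n = sizeof(exps) / sizeof(exps[0]);
	const long now = 100;	/* recovery waited on stale clients until here */
	size_t evicted = 0;

	if (refresh_before_clear)
		toy_refresh_timers(exps, n, now);

	tgt.recovering = false;

	/* The scan happens to run before the final requests are handled. */
	toy_evictor_scan(&tgt, exps, n, now);

	if (!refresh_before_clear)
		toy_refresh_timers(exps, n, now);	/* too late to help */

	for (size_t i = 0; i < n; i++)
		if (exps[i].evicted)
			evicted++;
	return evicted;
}
```

In this model, toy_finish_recovery(false) evicts both exports because the scan sees timestamps that are 100 seconds old once the flag is dropped, while toy_finish_recovery(true) evicts none because the timers are fresh before the flag changes.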