LU-11519 hsm: handle hsd_request_count == 0 properly
author John L. Hammond <jhammond@whamcloud.com>
Mon, 5 Nov 2018 17:48:55 +0000 (11:48 -0600)
committer Oleg Drokin <green@whamcloud.com>
Wed, 21 Nov 2018 04:06:09 +0000 (04:06 +0000)
In mdt_cdt_waiting_cb() it may be that the coordinator has already
reached the limit of active requests and hsd contains no requests to
be started. Handle this properly when trying to prioritize a restore.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ic843b7672ae6a4509ac127c2d2f90bf3681f84fc
Reviewed-on: https://review.whamcloud.com/33580
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/mdt/mdt_coordinator.c

index 19fd0f5..6795c7d 100644
@@ -163,6 +163,9 @@ static int mdt_cdt_waiting_cb(const struct lu_env *env,
        int i;
 
        /* Are agents full? */
+       if (atomic_read(&cdt->cdt_request_count) >= cdt->cdt_max_requests)
+               RETURN(hsd->hsd_housekeeping ? 0 : LLOG_PROC_BREAK);
+
        if (hsd->hsd_action_count + atomic_read(&cdt->cdt_request_count) >=
            cdt->cdt_max_requests) {
                /* We cannot send any more request
@@ -224,8 +227,10 @@ static int mdt_cdt_waiting_cb(const struct lu_env *env,
 
                        /* Discard the (whole) last hal */
                        hsd->hsd_request_count--;
+                       LASSERT(hsd->hsd_request_count >= 0);
                        tmp = &hsd->hsd_request[hsd->hsd_request_count];
                        hsd->hsd_action_count -= tmp->hal->hal_count;
+                       LASSERT(hsd->hsd_action_count >= 0);
                        OBD_FREE(tmp->hal, tmp->hal_sz);
                } else {
                        /* Bailing out, this code path is too hot */
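
The two hunks can be read in isolation as a guard plus a pair of invariants. The following is a minimal sketch of that logic, not the Lustre code itself: the sketch_* structs, the SKETCH_* return values, and the plain int counter (in place of atomic_read() on cdt_request_count) are hypothetical stand-ins, and assert() stands in for LASSERT().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return values. SKETCH_BREAK stands in for LLOG_PROC_BREAK;
 * SKETCH_NOT_FULL means "fall through to the existing checks". */
enum { SKETCH_NOT_FULL = -1, SKETCH_CONTINUE = 0, SKETCH_BREAK = 1 };

/* Simplified stand-in for the coordinator state used by the guard. */
struct sketch_cdt {
	int request_count;	/* models atomic_read(&cdt->cdt_request_count) */
	int max_requests;	/* models cdt->cdt_max_requests */
};

/* Simplified stand-in for the per-scan data (hsd). */
struct sketch_hsd {
	bool housekeeping;	/* models hsd->hsd_housekeeping */
	int request_count;	/* models hsd->hsd_request_count */
	int action_count;	/* models hsd->hsd_action_count */
};

/* First hunk: if the coordinator alone is already at its limit, return
 * before any code that assumes hsd holds at least one request. */
static int sketch_agents_full_guard(const struct sketch_cdt *cdt,
				    const struct sketch_hsd *hsd)
{
	if (cdt->request_count >= cdt->max_requests)
		return hsd->housekeeping ? SKETCH_CONTINUE : SKETCH_BREAK;

	return SKETCH_NOT_FULL;
}

/* Second hunk: discarding the last entry must never drive either
 * counter negative. last_hal_count models tmp->hal->hal_count. */
static void sketch_discard_last(struct sketch_hsd *hsd, int last_hal_count)
{
	hsd->request_count--;
	assert(hsd->request_count >= 0);
	hsd->action_count -= last_hal_count;
	assert(hsd->action_count >= 0);
}
```

As the diff reads, without the early return a scan with hsd_request_count == 0 could reach the discard path, decrement the count to -1, and then index hsd_request[] with it. The guard keeps that path unreachable when the coordinator is already full, and the assertions document the invariant.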