LU-16830 lod: improve rr allocation 96/50996/3
author: Alexander Boyko <alexander.boyko@hpe.com>
Fri, 12 May 2023 21:32:20 +0000 (17:32 -0400)
committer: Oleg Drokin <green@whamcloud.com>
Fri, 9 Jun 2023 05:27:30 +0000 (05:27 +0000)
Round-robin allocation uses atomic_inc() % ost_count to
generate the OST index. When some OSTs are unavailable and
many threads are creating objects, every attempt made by one
thread can land on the same OST idx. For example, in a 4-OST
configuration where 2 OSTs are in failover, each of the 12
attempts hits an unavailable OST with probability 0.5, so the
estimated probability is 0.5^12 = 0.024%. The result is ENOSPC
for the user application.

On the last speed loop, try the OSTs one by one instead.
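
The failure mode and the fix can be sketched as a small single-threaded
simulation. This is only an illustration, not the Lustre code: the names
start_idx, counter_inc_return(), other_threads_interfere() and alloc_one()
are hypothetical, the three speed passes (0..2) are inferred from the
`speed < 2` check in the diff, and the interference pattern is contrived
so that the shared counter always maps back to the same slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OST_COUNT 4   /* 4 OSTs, as in the commit message example */
#define MAX_SPEED 2   /* assumed: speed passes 0, 1 and 2 */

/* Hypothetical stand-in for the shared lqr_start_idx counter. */
static unsigned int start_idx;

/* Stand-in for atomic_inc_return(); plain increment, single-threaded. */
static unsigned int counter_inc_return(void)
{
    return ++start_idx;
}

/*
 * Worst case: between two of our attempts, other threads advance the
 * counter by OST_COUNT - 1, so our next increment hits the same slot.
 */
static void other_threads_interfere(void)
{
    start_idx += OST_COUNT - 1;
}

/*
 * Return the chosen slot, or -1 if every attempt hit an unavailable OST.
 * With sequential_last_pass set, the last speed pass takes one value from
 * the shared counter and then walks idx++ locally, as the patch does.
 */
static int alloc_one(const bool *avail, bool sequential_last_pass)
{
    unsigned int idx = 0;

    assert(avail != NULL);
    start_idx = 0;  /* reset so repeated calls are comparable */

    for (int speed = 0; speed <= MAX_SPEED; speed++) {
        for (int i = 0; i < OST_COUNT; i++) {
            if (!sequential_last_pass || speed < MAX_SPEED || i == 0)
                idx = counter_inc_return();
            else
                idx++;

            if (avail[idx % OST_COUNT])
                return (int)(idx % OST_COUNT);

            other_threads_interfere();
        }
    }
    return -1;
}
```

With `avail = { true, false, true, false }` (two of four OSTs unavailable),
`alloc_one(avail, false)` exhausts all 12 attempts on slot 1, which is the
ENOSPC case, while `alloc_one(avail, true)` steps to the neighbouring slot
on the last pass and succeeds.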

HPE-bug-id: LUS-11265
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I325cf4ad706c9b0df64cf53792e77c1fad6f7739
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50996
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
lustre/lod/lod_qos.c

index 4775b40..978e81c 100644 (file)
@@ -749,7 +749,7 @@ static int lod_ost_alloc_rr(const struct lu_env *env, struct lod_object *lo,
        __u32 stripe_idx = 0;
        __u32 stripe_count, stripe_count_min, ost_idx;
        int rc, speed = 0, ost_connecting = 0;
-       int stripes_per_ost = 1;
+       int idx, stripes_per_ost = 1;
        bool overstriped = false;
        ENTRY;
 
@@ -808,11 +808,13 @@ repeat_find:
                  lqr->lqr_start_count, lqr->lqr_offset_idx, osts->op_count,
                  osts->op_count);
 
-       for (i = 0; i < osts->op_count * stripes_per_ost &&
+       for (i = 0, idx = 0; i < osts->op_count * stripes_per_ost &&
                    stripe_idx < stripe_count; i++) {
-               int idx;
+               if (likely(speed < 2) || i == 0)
+                       idx = atomic_inc_return(&lqr->lqr_start_idx);
+               else
+                       idx++;
 
-               idx = atomic_inc_return(&lqr->lqr_start_idx);
                array_idx = (idx + lqr->lqr_offset_idx) %
                                osts->op_count;
                ost_idx = lqr->lqr_pool.op_array[array_idx];