LU-13358 libcfs: add timeout to cfs_race() to fix race
author Alex Zhuravlev <bzzz@whamcloud.com>
Tue, 30 Mar 2021 05:57:14 +0000 (08:57 +0300)
committer Andreas Dilger <adilger@whamcloud.com>
Fri, 23 Sep 2022 16:41:12 +0000 (16:41 +0000)
There is no guarantee that the two branches of cfs_race() are executed
in strict order, so the second branch (with cfs_race_state=1) may run
before the first one. A thread that then executes the first branch
resets cfs_race_state to 0, starts waiting, and gets stuck.

This construction is used for testing only, so as a workaround it is
enough to time out the wait (after 5 seconds).

Lustre-change: https://review.whamcloud.com/43161
Lustre-commit: 2d2d381f35ee004319a20f5d2d8e70d13480d6c7

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie1cc0accedb3e1a198d4b17d1ab00ce298c560f2
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48553
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
libcfs/include/libcfs/libcfs_fail.h

index d36e58d..9e57506 100644
@@ -167,8 +167,14 @@ static inline void cfs_race(__u32 id)
                        int rc;
                        cfs_race_state = 0;
                        CERROR("cfs_race id %x sleeping\n", id);
-                       rc = wait_event_interruptible(cfs_race_waitq,
-                                                     cfs_race_state != 0);
+                       /*
+                        * XXX: don't wait forever as there is no guarantee
+                        * that this branch is executed first. for testing
+                        * purposes this construction works good enough
+                        */
+                       rc = wait_event_interruptible_timeout(cfs_race_waitq,
+                                                     cfs_race_state != 0,
+                                                     cfs_time_seconds(5));
                        CERROR("cfs_fail_race id %x awake: rc=%d\n", id, rc);
                } else {
                        CERROR("cfs_fail_race id %x waking\n", id);