LU-18174 osd-ldiskfs: do not miss readdir's actor failure 68/56168/5
author: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Thu, 9 Jan 2025 14:58:15 +0000 (17:58 +0300)
committer: Oleg Drokin <green@whamcloud.com>
Sun, 2 Feb 2025 06:24:57 +0000 (06:24 +0000)
commit: 8f1b4dc1d634501fd3ffb8962cb3b8be29941e41
tree: b412a7e26e2f069ac038b48406691da8adad77d1
parent: edabcf5ee2fdea18d55ea86cfab2b6013de1d50a
LU-18174 osd-ldiskfs: do not miss readdir's actor failure

lfsck falls into an endless loop in osd_check_lmv() if osd_iget2()
returns an error:
lfsck_master_engine
 lfsck_master_oit_engine
  lfsck_object_find_bottom
   lfsck_object_find_by_dev
    lu_object_find_at
     lu_object_start
      osd_object_init
       osd_fid_lookup
        osd_check_lmv      << this endlessly calls iterate_dir
         iterate_dir
          ldiskfs_readdir
           ldiskfs_dx_readdir
            call_filldir
             dir_emit
              osd_stripe_dir_filldir
               osd_iget
                osd_iget2   << this returns error

Use struct osd_check_lmv_buf to inform osd_check_lmv() about an error
in the readdir callback, so that the loop around iterate_dir() can stop.
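
For illustration only, here is a minimal userspace sketch of the pattern described above. It is not the code from this patch: toy_iterate_dir(), toy_filldir(), toy_check_dir(), struct check_buf and its fields are hypothetical stand-ins, and the sketch assumes that the iterator returns 0 even when the actor stops it early, which is what makes the failure invisible to the caller.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct osd_check_lmv_buf; field names are assumed. */
struct check_buf {
	int cb_error;	/* set by the actor when it fails */
	int cb_calls;	/* number of iterator invocations, for illustration */
};

typedef int (*actor_t)(void *ctx, const char *name);

/*
 * Toy model of iterate_dir(): feed entries to the actor and stop at the
 * first nonzero actor result. The position is not advanced past the failing
 * entry and the return value is still 0 (assumption of this sketch).
 */
static int toy_iterate_dir(const char *const *names, size_t n, size_t *pos,
			   actor_t actor, void *ctx)
{
	while (*pos < n) {
		if (actor(ctx, names[*pos]) != 0)
			return 0;
		(*pos)++;
	}
	return 0;
}

/* Toy actor: a name starting with '!' models a failing inode lookup. */
static int toy_filldir(void *ctx, const char *name)
{
	struct check_buf *buf = ctx;

	if (name[0] == '!') {
		buf->cb_error = -EIO;	/* record the failure for the caller */
		return 1;		/* nonzero stops the iteration */
	}
	return 0;
}

/*
 * Toy caller: repeat the iteration until the directory is exhausted.
 * Without the cb_error check, a failing entry would be retried forever,
 * because the position never reaches n and rc stays 0.
 */
static int toy_check_dir(const char *const *names, size_t n,
			 struct check_buf *buf)
{
	size_t pos = 0;
	int rc;

	buf->cb_error = 0;
	buf->cb_calls = 0;
	do {
		rc = toy_iterate_dir(names, n, &pos, toy_filldir, buf);
		buf->cb_calls++;
		if (rc == 0 && buf->cb_error != 0)
			rc = buf->cb_error;	/* propagate the actor failure */
	} while (rc == 0 && pos < n);

	return rc;
}
```

In this sketch, calling toy_check_dir() on a list that contains a failing entry returns the recorded error after a single pass instead of retrying the same entry indefinitely.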

A test illustrating the issue is added.

Test-Parameters: trivial testlist=sanity-lfsck env=ONLY=43
HPE-bug-id: LUS-12365
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: I57a4b739c9ad5a8c09bdad05752714830d584595
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56168
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/obd_support.h
lustre/osd-ldiskfs/osd_handler.c
lustre/tests/conf-sanity.sh
lustre/tests/sanity-lfsck.sh
lustre/tests/test-framework.sh