LU-16019 llite: fully disable readahead in kernel I/O path 93/47993/4
author Qian Yingjin <qian@ddn.com>
Wed, 20 Jul 2022 02:22:35 +0000 (22:22 -0400)
committer Oleg Drokin <green@whamcloud.com>
Mon, 8 Aug 2022 19:53:57 +0000 (19:53 +0000)
In newer kernels (e.g. RHEL 9 or Ubuntu 22.04), the readahead path
may escape the control of the Lustre CLIO engine:

generic_file_read_iter()
  ->filemap_read()
    ->filemap_get_pages()
      ->page_cache_sync_readahead()
        ->page_cache_sync_ra()

void page_cache_sync_ra()
{
	if (!ractl->ra->ra_pages || blk_cgroup_congested()) {
		if (!ractl->file)
			return;
		req_count = 1;
		do_forced_ra = true;
	}

	/* be dumb */
	if (do_forced_ra) {
		force_page_cache_ra(ractl, req_count);
		return;
	}
	...
}

According to the kernel readahead code, even if read-ahead is
disabled (via @ra_pages == 0), the kernel still issues the request
as read-ahead because the pages are needed to satisfy the requested
range. The forced read-ahead is meant to limit the read to just the
requested range, which is set to 1 page in this case.
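
The branch described above can be modelled in user space. This is only a
sketch of the excerpted condition: struct model_ra_state and
model_sync_ra_forced_pages() are hypothetical names invented for this
illustration, not kernel symbols, and the heuristics elided by "..." in
the excerpt are not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state consulted in the excerpt. */
struct model_ra_state {
	unsigned long ra_pages;	/* plays the role of ractl->ra->ra_pages */
	bool has_file;		/* plays the role of ractl->file != NULL */
	bool congested;		/* plays the role of blk_cgroup_congested() */
};

/*
 * Return the page count handed to the forced read-ahead helper by the
 * excerpted branch, or 0 when that branch does not force a read-ahead.
 * Only the "read-ahead disabled or congested" case is modelled.
 */
static unsigned long
model_sync_ra_forced_pages(const struct model_ra_state *ra)
{
	if (ra->ra_pages == 0 || ra->congested) {
		if (!ra->has_file)
			return 0;
		/* req_count = 1; do_forced_ra = true; */
		return 1;
	}
	/* Regular heuristics are elided in the excerpt and not modelled. */
	return 0;
}
```

With ra_pages == 0 and a valid file, the model still requests one page,
which is the behaviour the paragraph above describes.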

Thus setting @ra_pages to 0 alone cannot fully avoid read-ahead in
the kernel I/O path.
To fully disable read-ahead in the Linux kernel I/O path, @io_pages
must also be set to 0, which clamps the I/O range to 0 pages in
@force_page_cache_ra():
void force_page_cache_ra()
{
	...
	max_pages = max_t(unsigned long, bdi->io_pages,
			  ra->ra_pages);
	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
	while (nr_to_read) {
		...
	}
	...
}

After setting bdi->io_pages to 0, sanity/101j passes.
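
The clamp in the excerpt can be checked with a small user-space model.
This is a sketch under stated assumptions: model_forced_ra_pages() and
its helpers are hypothetical names for illustration only, the value 256
used in the note below is an arbitrary example, and the loop that
actually submits the pages is not modelled.

```c
/* Hypothetical helpers standing in for the kernel's max_t()/min_t(). */
static unsigned long model_max(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long model_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Return how many pages the excerpted clamp leaves to read:
 * min(nr_to_read, max(io_pages, ra_pages)).
 */
static unsigned long
model_forced_ra_pages(unsigned long nr_to_read, unsigned long io_pages,
		      unsigned long ra_pages)
{
	unsigned long max_pages = model_max(io_pages, ra_pages);

	return model_min(nr_to_read, max_pages);
}
```

In this model, a 1-page request with ra_pages == 0 but io_pages == 256
still yields 1 page, while io_pages == 0 and ra_pages == 0 yield 0 pages,
so the while (nr_to_read) loop is never entered.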

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I859a6404abb9116d9acfa03de91e61d3536d3554
Reviewed-on: https://review.whamcloud.com/47993
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/autoconf/lustre-core.m4
lustre/llite/llite_lib.c

index bcd0294..9adaac6 100644
@@ -1943,6 +1943,27 @@ posix_acl_update_mode, [
 ]) # LC_POSIX_ACL_UPDATE_MODE
 
 #
+# LC_HAVE_BDI_IO_PAGES
+#
+# Kernel version 4.9 commit 9491ae4aade6814afcfa67f4eb3e3342c2b39750
+# mm: don't cap request size based on read-ahead setting
+# This patch introduces a bdi hint, io_pages.
+#
+AC_DEFUN([LC_HAVE_BDI_IO_PAGES], [
+LB_CHECK_COMPILE([if 'struct backing_dev_info' has 'io_pages' field],
+bdi_has_io_pages, [
+       #include <linux/backing-dev.h>
+],[
+       struct backing_dev_info info;
+
+       info.io_pages = 0;
+],[
+       AC_DEFINE(HAVE_BDI_IO_PAGES, 1,
+               [backing_dev_info has io_pages])
+])
+]) # LC_HAVE_BDI_IO_PAGES
+
+#
 # LC_IOP_GENERIC_READLINK
 #
 # Kernel version 4.10 commit dfeef68862edd7d4bafe68ef7aeb5f658ef24bb5
@@ -2822,6 +2843,7 @@ AC_DEFUN([LC_PROG_LINUX], [
        LC_GROUP_INFO_GID
        LC_VFS_SETXATTR
        LC_POSIX_ACL_UPDATE_MODE
+       LC_HAVE_BDI_IO_PAGES
 
        # 4.10
        LC_IOP_GENERIC_READLINK
index 94dc136..8aec4f8 100644
@@ -1348,6 +1348,9 @@ int ll_fill_super(struct super_block *sb)
 
        /* disable kernel readahead */
        sb->s_bdi->ra_pages = 0;
+#ifdef HAVE_BDI_IO_PAGES
+       sb->s_bdi->io_pages = 0;
+#endif
 
        /* Call ll_debugfs_register_super() before lustre_process_log()
         * so that "llite.*.*" params can be processed correctly.