Whamcloud - gitweb
fs/lustre-release.git
5 weeks ago LU-18793 gss: coverity issue in prepare_krb5_rfc4121_buffer 71/58371/2
Sebastien Buisson [Tue, 11 Mar 2025 16:06:19 +0000 (17:06 +0100)]
LU-18793 gss: coverity issue in prepare_krb5_rfc4121_buffer

Fix issue found by Coverity in prepare_krb5_rfc4121_buffer():

CoverityID 457079:    (RESOURCE_LEAK)
   Variable "derived_key" going out of scope leaks the storage
   "derived_key.data" points to.
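
A leak of this kind is usually closed by routing every exit path through one cleanup label. The following is a minimal sketch of that pattern only; struct key_blob, derive_key() and prepare_buffer() are hypothetical stand-ins and are not the actual Lustre GSS code or the change made by this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a key blob; not a Lustre type. */
struct key_blob {
	unsigned char *data;
	size_t len;
};

/* Hypothetical derivation step: allocates out->data, returns 0 or -1. */
static int derive_key(struct key_blob *out, size_t len)
{
	out->data = calloc(1, len);
	if (out->data == NULL)
		return -1;
	out->len = len;
	return 0;
}

/*
 * Every exit path after a successful derive_key() goes through the
 * "out" label, so derived.data is released on errors as well as on
 * success. Returns 0 on success, -1 on failure.
 */
static int prepare_buffer(unsigned char *dst, size_t dst_len, size_t key_len)
{
	struct key_blob derived = { NULL, 0 };
	int rc;

	assert(dst != NULL);

	rc = derive_key(&derived, key_len);
	if (rc != 0)
		return rc;	/* nothing allocated yet */

	if (derived.len > dst_len) {
		rc = -1;	/* error path must not return directly */
		goto out;
	}

	memcpy(dst, derived.data, derived.len);
	rc = 0;
out:
	free(derived.data);
	return rc;
}
```

Initializing the blob to { NULL, 0 } keeps the single free() safe even if more early exits are added later.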

Fixes: c7cf297687 ("LU-18256 gss: deprecate insecure enctypes")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1a2924555814ca6ce643b9e7cad217a7f6725765
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58371
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-17000 mgs: check lcfg parameters 60/58360/7
Andreas Dilger [Tue, 11 Mar 2025 00:17:59 +0000 (18:17 -0600)]
LU-17000 mgs: check lcfg parameters

Check lcfg parameters in mgs_iocontrol() and mgs_iocontrol_pool().
Make the callers more consistent with buffer access and error checks.
REC_DATA_LEN() should be used instead of "rec_len" for lcfg size,
since "rec_len" is the full llog record size including hdr/tail.
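
The difference between the full record size and the payload size can be sketched as follows. The struct layouts and the helper names rec_payload_len() and payload_fits() are assumptions made for illustration; they are not the Lustre llog definitions or the REC_DATA_LEN() macro itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical record framing; field layout is illustrative. */
struct rec_hdr {
	uint32_t len;	/* full record size: header + payload + tail */
	uint32_t index;
	uint32_t type;
	uint32_t id;
};

struct rec_tail {
	uint32_t len;
	uint32_t index;
};

/*
 * Payload size of a record. Returns 0 if the advertised length cannot
 * even hold the header and tail, so callers never see a wrapped value.
 */
static size_t rec_payload_len(const struct rec_hdr *hdr)
{
	size_t overhead = sizeof(struct rec_hdr) + sizeof(struct rec_tail);

	assert(hdr != NULL);
	if (hdr->len < overhead)
		return 0;
	return hdr->len - overhead;
}

/* Validate a claimed payload size (for example a config size) against the record. */
static int payload_fits(const struct rec_hdr *hdr, size_t claimed)
{
	return claimed <= rec_payload_len(hdr);
}
```

Checking an untrusted size against the payload length rather than the full record length is what prevents a config buffer from being accepted when it would overlap the header or tail.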

CoverityID: 397130 ("Passing tainted expression lcfg->lcfg_buflens")
CoverityID: 397132 ("Passing tainted expression lcfg->lcfg_buflens")
CoverityID: 425252 ("Untrusted value as argument")

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9a7052a24d77124582df61340e520c1db6433892
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58360
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18791 lmv: report an overflowed nlink correctly 57/58357/9
coreytesdahl [Wed, 4 Dec 2024 22:37:06 +0000 (16:37 -0600)]
LU-18791 lmv: report an overflowed nlink correctly

The inode nlink count should be set to 1 if any component
stripe in a striped directory has overflowed and been set to 1.
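
The rule can be sketched as a merge over the per-stripe link counts. This is an illustration under stated assumptions, not the lmv code: merge_stripe_nlink() is a hypothetical helper, a stripe value of 1 is treated as the overflow marker, and any per-stripe adjustment the real merge may apply is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Merge per-stripe link counts into one directory link count.
 * Assumption for this sketch: a stripe reports 1 only when its own
 * count has overflowed.
 */
static uint32_t merge_stripe_nlink(const uint32_t *stripe_nlink, size_t count)
{
	uint32_t total = 0;
	size_t i;

	assert(stripe_nlink != NULL);
	for (i = 0; i < count; i++) {
		if (stripe_nlink[i] == 1)
			return 1;	/* overflow marker wins over any sum */
		total += stripe_nlink[i];
	}
	return total;
}
```

Returning 1 as soon as one stripe has overflowed avoids reporting a sum that looks valid but undercounts the directory.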

HPE-bug-id: LUS-12747
Signed-off-by: Corey Tesdahl <corey.tesdahl@hpe.com>
Change-Id: I2a3b6f5bd846d11d768ee1979b7f11f0e7cf1c88
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58357
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-16518 mgc: remove mne_swab from mgc_process_nodemap_log() 50/58350/3
Timothy Day [Sat, 8 Mar 2025 21:53:33 +0000 (16:53 -0500)]
LU-16518 mgc: remove mne_swab from mgc_process_nodemap_log()

mne_swab is no longer used and can be removed.

Fixes: 41610e6207ef ("LU-8837 mgc: move server-only code out of mgc_request.c")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I86f966ba03fcfcec7d91952cc126c1a008c5e26f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58350
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18758 tests: test lfs migrate --block on changing file 99/58299/5
Feng Lei [Wed, 5 Mar 2025 02:51:09 +0000 (10:51 +0800)]
LU-18758 tests: test lfs migrate --block on changing file

Run 'lfs migrate --block' on a continuously changing file
to check whether the '--block' option is broken.

Test-Parameters: trivial env="ONLY=56xab,ONLY_REPEAT=500"
Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: I48492b6296f81d0cbe3bc83fad2ffeb855dff1aa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58299
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18748 class: print statfs_state for OBD devices 59/58259/21
Andreas Dilger [Thu, 27 Feb 2025 02:36:01 +0000 (19:36 -0700)]
LU-18748 class: print statfs_state for OBD devices

In addition to the regular os_filesfree, os_kbytesfree, etc.
parameters available for each OBD device, add os_state as
a "statfs_state" parameter that contains a single character
for each of the OS_STATE_* flags that indicate an uncommon
aspect of the target, so that it can be checked.

This allows checking if an OST is read-only ('R' character),
out of blocks ('B'), or inodes ('I'), degraded ('D'),
or is a flash-based (non-rotational) storage device ('f').
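
The flag-to-character encoding described above can be sketched like this. The DEMO_* bit values, their order, and the helper state_to_string() are assumptions for illustration; only the letters R, B, I, D and f come from the commit message, and the real OS_STATE_* values are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; not the real OS_STATE_* values. */
enum demo_state {
	DEMO_DEGRADED	= 1 << 0,
	DEMO_READONLY	= 1 << 1,
	DEMO_NOSPC	= 1 << 2,	/* out of blocks */
	DEMO_NOINODES	= 1 << 3,	/* out of inodes */
	DEMO_NONROT	= 1 << 4,	/* flash-based storage */
};

static const struct {
	unsigned int flag;
	char letter;
} demo_state_names[] = {
	{ DEMO_DEGRADED, 'D' },
	{ DEMO_READONLY, 'R' },
	{ DEMO_NOSPC,    'B' },
	{ DEMO_NOINODES, 'I' },
	{ DEMO_NONROT,   'f' },
};

/*
 * Write one character per set flag into buf and NUL-terminate it.
 * Returns the number of characters written, excluding the NUL.
 */
static size_t state_to_string(unsigned int state, char *buf, size_t buflen)
{
	size_t used = 0;
	size_t i;

	assert(buf != NULL && buflen > 0);
	for (i = 0; i < sizeof(demo_state_names) / sizeof(demo_state_names[0]); i++) {
		if ((state & demo_state_names[i].flag) && used + 1 < buflen)
			buf[used++] = demo_state_names[i].letter;
	}
	buf[used] = '\0';
	return used;
}
```

An empty string then means that none of the uncommon conditions apply, which keeps the parameter easy to test from scripts.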

Remove old procfs wrappers for printing statfs data, since
they are no longer used in any parts of the code.

Modify sanity test_56c() to verify that the new statfs_state
is matching the state printed by "lfs df -v".  Allow this
subtest to be run on flash devices that already have 'f' set.
Remove hard-coded waits. Clean up properly in case of errors.

Fix up the code style of sanity test_79 to match current rules.

Allow restore_lustre_params() to accept a filename argument, and
remove it after restore, rather than needing this in all callers.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0462acdb7b9b7ec76792c9ff50206af2ae2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58259
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-18720 quota: check for project quota 66/58066/8
Alex Zhuravlev [Thu, 13 Feb 2025 13:27:14 +0000 (16:27 +0300)]
LU-18720 quota: check for project quota

lqi_ignore_root_proj_quota should be checked only with project quota,
otherwise its state is undefined and can lead to undefined results.

Fixes: 2686838fef ("LU-18240 sec: enforce per-nodemap project quota for root")
Test-Parameters: testlist=sanity-quota,sanity-quota,sanity-quota
Test-Parameters: fstype=zfs testlist=sanity-quota,sanity-quota,sanity-quota
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I02896639ed375ddd314fa44dc9cddb008b1dbb0b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58066
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
5 weeks ago LU-18690 tests: sanity-quota/79 should return 0 47/57947/4
Alex Zhuravlev [Mon, 3 Feb 2025 03:25:56 +0000 (06:25 +0300)]
LU-18690 tests: sanity-quota/79 should return 0

The test should not return the result of a background process that is
trying to access the pool's information.

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib9446b222c3be721e9ec5e9a4da67f249ac46aa0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57947
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18524 utils: Improve lctl nodemap_modify interface 48/57448/8
Marc Vef [Sun, 15 Dec 2024 18:21:13 +0000 (19:21 +0100)]
LU-18524 utils: Improve lctl nodemap_modify interface

"lctl nodemap_modify" uses two separate parameters for the property
name and the corresponding value. This patch slightly improves this
interface by allowing "=value" as part of --property similar to "lctl
set_param" commands:

lctl nodemap_modify --name nm_0 --property admin=1

instead of (which is still allowed):

lctl nodemap_modify --name nm_0 --property admin --value 1

sanity-sec fileset_test_setup() was modified to exercise this
interface.
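
Accepting both forms comes down to splitting the --property argument at the first '='. A minimal sketch of that step follows; split_property() is a hypothetical helper and not the lctl implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "name=value" in place. On return *name points at the property
 * name. If an '=' is present, *value points at the text after it;
 * otherwise *value is left untouched so that a separate --value
 * argument can still supply it.
 * Returns 1 if an inline value was found, 0 otherwise.
 */
static int split_property(char *arg, char **name, char **value)
{
	char *eq;

	assert(arg != NULL && name != NULL && value != NULL);
	*name = arg;
	eq = strchr(arg, '=');
	if (eq == NULL)
		return 0;
	*eq = '\0';
	*value = eq + 1;
	return 1;
}
```

Leaving *value alone when no '=' is found is one way to keep the older two-argument form working unchanged.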

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I030e1602a67c066e6daf4187589a77c1b49ee855
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57448
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-18454 utils: 'lfs migrate' can read filenames from file 04/57104/9
Feng Lei [Fri, 22 Nov 2024 01:18:53 +0000 (09:18 +0800)]
LU-18454 utils: 'lfs migrate' can read filenames from file

Enhance the 'lfs migrate' and 'lfs mirror extend' commands to
be able to read filenames from a file or pipeline and handle
all the files in one process.

When -0 or --null is specified, read filenames from stdin by
default. Each filename is terminated by a NUL character, so this
works well together with 'lfs find -0'. For example:

  # lfs find /mnt/lustre --ost 0 -0 | lfs migrate -0 --ost 1

When --files-from=LISTFILE is specified, read filenames from
LISTFILE. One line for each filename. If LISTFILE is -, read
from stdin. If --null is specified at the same time, filenames
are separated by NUL char in LISTFILE.
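
Reading a list that is either newline-separated or NUL-separated can be done with one loop around POSIX getdelim(). This is a sketch of the idea only; for_each_name() and count_name() are hypothetical helpers, not the lfs code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

typedef int (*name_cb)(const char *name, void *arg);

/*
 * Call cb once per entry read from "in". Entries are separated by
 * delim ('\n' for one name per line, '\0' for NUL-separated lists).
 * Stops early if cb returns non-zero. Returns the number of entries
 * for which cb returned 0.
 */
static int for_each_name(FILE *in, int delim, name_cb cb, void *arg)
{
	char *line = NULL;
	size_t cap = 0;
	ssize_t n;
	int count = 0;

	assert(in != NULL && cb != NULL);
	while ((n = getdelim(&line, &cap, delim, in)) > 0) {
		if (line[n - 1] == (char)delim)
			line[--n] = '\0';	/* strip the separator */
		if (n == 0)
			continue;		/* skip empty entries */
		if (cb(line, arg) != 0)
			break;
		count++;
	}
	free(line);
	return count;
}

/* Example callback: count the names seen. */
static int count_name(const char *name, void *arg)
{
	(void)name;
	(*(int *)arg)++;
	return 0;
}
```

Using NUL as the separator is what makes names containing newlines or spaces safe to pass through a pipeline.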

Filenames can be specified on command line directly as before.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: Id12aaba2b52a8a541a4d552d37facf6fb0fadc57
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57104
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18462 nodemap: sanity checks for positional parameters 87/57087/14
Sebastien Buisson [Wed, 20 Nov 2024 14:31:15 +0000 (15:31 +0100)]
LU-18462 nodemap: sanity checks for positional parameters

A number of nodemap commands just take positional parameters instead
of options:
- nodemap_activate
- nodemap_add
- nodemap_del
- nodemap_test_nid
In this case, we need to check manually that we got the correct number
of arguments, and return a proper error message otherwise.

These commands are also modified to grok named options, in addition to
positional parameters that we will have to keep for some time for
backward compatibility reasons.
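
A manual check of the positional argument count might look like the sketch below. check_positional() is a hypothetical helper written for illustration; the error text and the limits are assumptions, not the messages printed by lctl.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Check the number of positional arguments left after option parsing.
 * argv[first..argc-1] are taken to be the positional parameters, as
 * with optind after getopt. Returns 0 if the count is acceptable,
 * -1 otherwise.
 */
static int check_positional(const char *cmd, int argc, int first,
			    int min_args, int max_args)
{
	int given = argc - first;

	assert(cmd != NULL && min_args <= max_args);
	if (given < min_args || given > max_args) {
		fprintf(stderr, "%s: expected %d to %d arguments, got %d\n",
			cmd, min_args, max_args, given);
		return -1;
	}
	return 0;
}
```

Running the check after option parsing lets the same command accept named options and the older positional form.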

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie515e8e03f30581c7fb84efa3f881769fc02a397
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57087
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-18114 llog: split "lctl llog_*" group into subcommands 40/56040/8
Emoly Liu [Wed, 11 Dec 2024 09:23:38 +0000 (17:23 +0800)]
LU-18114 llog: split "lctl llog_*" group into subcommands

Split "lctl llog_*" command group into subcommands, e.g.
"lctl llog_info" to "lctl llog info".
Also, conf-sanity.sh test_123ad is modified to verify
this patch.

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Iea1cbfac61ff807d793a7a48dafa73915b06639d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56040
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 weeks ago LU-17843 build: mount.lustre_tgt as symlink 79/55079/5
Patrick Farrell [Sat, 11 May 2024 19:10:51 +0000 (15:10 -0400)]
LU-17843 build: mount.lustre_tgt as symlink

mount.lustre_tgt was always intended to be a symlink to the
main mount.lustre, but instead it's an identical build item.

This results in this rpmbuild warning (an error on some systems):
    Duplicate build-ids .../mount.lustre and .../mount.lustre_tgt

This patch resolves this by making it a proper symlink as was
originally intended.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Id5f90df94fcc5c73a93fd6c1311b09af429c1440
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55079
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-18792 tests: fix sanity-hsm test_26e for interop testing 73/58373/3
Etienne AUJAMES [Tue, 11 Mar 2025 15:51:02 +0000 (16:51 +0100)]
LU-18792 tests: fix sanity-hsm test_26e for interop testing

sanity-hsm test_26e fails in interop with 2.15.x version.
The test needs to be skipped for interop testing.

Test-Parameters: trivial
Test-Parameters: testlist=sanity serverversion=2.15.6 env=ONLY=26e
Fixes: 241cf3c6d0 ("LU-16235 hsm: get a valid cookie for RAoLU request")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I56bafcfb2617637a88e328995aefb6ffa9785b37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18528 test: wait for quota pool nr 09/58409/2
Hongchao Zhang [Fri, 14 Mar 2025 08:54:56 +0000 (16:54 +0800)]
LU-18528 test: wait for quota pool nr

In sanity-quota test_68, the number of quota pools in the QMT
may be updated with a delay, so the test should wait for the update.

Test-Parameters: trivial testlist=sanity-quota fstype=zfs env=ONLY=68,ONLY_REPEAT=50
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I2e1964cf39a493d68ddd6463aaa28cf173951979
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58409
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-17950 ldiskfs: race in ext4_inode_attach_jinode 81/58381/4
Li Dongyang [Wed, 12 Mar 2025 09:28:53 +0000 (20:28 +1100)]
LU-17950 ldiskfs: race in ext4_inode_attach_jinode

A race condition could happen when multiple threads
try to attach a jinode for the same inode:

Thread 1:
ext4_map_blocks
  ext4_inode_attach_jinode
    spin_lock(&inode->i_lock)
    ei->jinode = jinode
->
    jbd2_journal_init_jbd_inode(ei->jinode, inode)

Thread 2:
ext4_map_blocks
  ext4_inode_attach_jinode
    if (ei->jinode || !EXT4_SB(inode->i_sb)->s_journal)
    return 0;
  ext4_jbd2_inode_add_write
->  jbd2_journal_file_inode

The problem is that in ext4_inode_attach_jinode() the initial check
of ei->jinode is not protected by inode->i_lock. Thread 2 could
go ahead and use the not yet initialized jinode
in jbd2_journal_file_inode(), and thread 1 will later
call jbd2_journal_init_jbd_inode(), corrupting the jinode.
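
One general way to close a window like this is to finish initializing the object before its pointer becomes visible, and to publish it under the lock. The sketch below shows that pattern in user space with a pthread mutex; it is not the ldiskfs change, and demo_inode, demo_jinode and attach_jinode() are hypothetical names. A lockless reader would additionally need appropriate memory ordering, which is outside this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct demo_jinode {
	int initialized;
};

struct demo_inode {
	pthread_mutex_t lock;
	struct demo_jinode *jinode;	/* NULL until attached */
};

/* Attach a jinode to the inode exactly once. Returns 0 or -1. */
static int attach_jinode(struct demo_inode *inode)
{
	struct demo_jinode *fresh;

	assert(inode != NULL);

	fresh = malloc(sizeof(*fresh));
	if (fresh == NULL)
		return -1;
	fresh->initialized = 1;		/* finish setup before anyone can see it */

	pthread_mutex_lock(&inode->lock);
	if (inode->jinode == NULL) {
		inode->jinode = fresh;	/* publish only a ready object */
		fresh = NULL;
	}
	pthread_mutex_unlock(&inode->lock);

	free(fresh);			/* lost the race: discard our copy */
	return 0;
}
```

With this ordering, any thread that observes a non-NULL pointer under the lock sees an object that has already been initialized.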

Note that this issue is specific to ldiskfs, because
ext4-attach-jinode-in-writepages.patch added
ext4_inode_attach_jinode() to make sure the jinode is initialized
before calling ext4_jbd2_inode_add_write().

Change-Id: Iafd7aa9537505afbf4bc53fef40ea3aa0a94b7da
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58381
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18779 lnet: lnetctl SIGSEGV in lnetctl.c getopt_internal() 22/58322/7
Frank Sehr [Thu, 6 Mar 2025 20:19:38 +0000 (12:19 -0800)]
LU-18779 lnet: lnetctl SIGSEGV in lnetctl.c getopt_internal()

Variable optindex was out of range. The whole check could be
simplified (only check optarg) if no negative values for verbose
are expected. Also modified for peer.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: I64aad7527377b098479e93040a84b0865b02de28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58322
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Manish Regmi <mregmi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-18676 tests: random write to set file size 83/58083/9
Hongchao Zhang [Sat, 15 Mar 2025 19:14:58 +0000 (03:14 +0800)]
LU-18676 tests: random write to set file size

The sanity-quota.sh test_49 is using "createmany -S 4k"
to set the size of a new file instead of writing all the
actual file data.  Add a new "-W SIZE" option to write
the specified number of random bytes instead of only
writing a few bytes at the end of the file.

This avoids issues with sparse files or data compression
resulting in less space being allocated than expected.

Test-Parameters: testlist=sanity-quota fstype=zfs env=ONLY=49,ONLY_REPEAT=50
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ida93da881b48e6fdd85b64e90991b85f28de63d4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58083
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
5 weeks ago LU-17777 tests: Exclude Files when comparing dir structure 29/58129/8
Arshad Hussain [Wed, 19 Feb 2025 11:04:37 +0000 (16:34 +0530)]
LU-17777 tests: Exclude Files when comparing dir structure

This patch excludes /etc/yum* and /etc/pki* as a corner case,
since RHEL updates them asynchronously, which breaks the test case.

Move the search folders from "/etc /bin" to "/etc /usr/bin", as
/bin can be a symlink.

Replace the arbitrary return codes 18 and 22 with $?. For
diff this should be 1 for a mismatch and 2 for files not found.

Replace the constant error message with unique messages when
rebuilding the dir structure before and after remount.

Test-Parameters: trivial testlist=runtests
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8f7824b930f6286e7e5744ff403a02cec280075d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago New tag 2.16.53 2.16.53 v2_16_53
Oleg Drokin [Wed, 19 Mar 2025 23:38:23 +0000 (19:38 -0400)]
New tag 2.16.53

Change-Id: I4a6d3cff8b78d64660d2848d02bbd0f624ea4e7e
Signed-off-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18740 mgs: size_t in contain_valid_fsname() 42/58142/3
Alex Zhuravlev [Fri, 21 Feb 2025 03:11:06 +0000 (06:11 +0300)]
LU-18740 mgs: size_t in contain_valid_fsname()

Use size_t to fix a build warning with gcc 11.5.0 (Rocky 9.3):

lustre/mgs/mgs_llog.c:4995:13: error: '__builtin_memcmp_eq'
 specified bound [18446744071562067968, 0] exceeds maximum
 object size 9223372036854775807 [-Werror=stringop-overread]
 4995 |         if (memcmp(buf, fsname, namelen) != 0)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
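
The lower bound in that diagnostic looks like a negative signed length converted to an unsigned size. Keeping the length in size_t from the start avoids the conversion, as in the sketch below. has_fsname_prefix() and its "name followed by '-'" rule are assumptions made for illustration and are not the contain_valid_fsname() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if buf starts with fsname followed by '-', else 0.
 * Lengths are size_t throughout, so no signed value is ever
 * converted to a huge unsigned bound for memcmp().
 */
static int has_fsname_prefix(const char *buf, size_t buflen, const char *fsname)
{
	size_t namelen;

	assert(buf != NULL && fsname != NULL);
	namelen = strlen(fsname);
	if (namelen == 0 || buflen <= namelen)
		return 0;
	if (memcmp(buf, fsname, namelen) != 0)
		return 0;
	return buf[namelen] == '-';
}
```

Checking buflen against namelen before the comparison also guarantees that memcmp() and the following index stay inside the buffer.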

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I77adc19e4d79d4a84a2cfe3c9601f5536ad8cc81
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58142
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18794 kernel: update RHEL 9.5 [5.14.0-503.31.1.el9_5] 78/58378/2
Jian Yu [Wed, 12 Mar 2025 07:36:46 +0000 (00:36 -0700)]
LU-18794 kernel: update RHEL 9.5 [5.14.0-503.31.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.31.1.el9_5.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.4 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.5 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3

Change-Id: Ie6ec03efef1ec6f5c2d165a0e0ac6c3d3a4fd54c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58378
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18795 kernel: update RHEL 8.10 [4.18.0-553.44.1.el8_10] 77/58377/2
Jian Yu [Wed, 12 Mar 2025 07:31:46 +0000 (00:31 -0700)]
LU-18795 kernel: update RHEL 8.10 [4.18.0-553.44.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.44.1.el8_10.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  env=SANITY_EXCEPT="66 413" \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3

Change-Id: I9bd1f29d006c9da858c941ae81352c75a332a36f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-17000 misc: avoid memory leaks in error handling 62/58362/3
Andreas Dilger [Tue, 11 Mar 2025 02:53:19 +0000 (20:53 -0600)]
LU-17000 misc: avoid memory leaks in error handling

Fix wrong GOTO() label "out_free:" instead of "out_record_free:".

Quiet a false positive for a leak in krb5_make_checksum(). Since
"req == NULL" is never returned by cfs_crypto_hash_init(),
cfs_crypto_hash_final() is always called. Coverity is confused.

CoverityID: 440607 ("Resource leak")
CoverityID: 457047 ("Resource leak")

Test-Parameters: trivial
Fixes: 11eef3f735 ("LU-10499 pcc: get PCC state for file without opening")
Fixes: 553d93361d ("LU-8602 gss: get rid of cfs_crypto_hash_desc")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I725921ad89534b8ff2d8bcd526fceca3fcd90d04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58362
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-17000 llite: fix memory leaks in error handling 61/58361/2
Andreas Dilger [Tue, 11 Mar 2025 01:39:58 +0000 (19:39 -0600)]
LU-17000 llite: fix memory leaks in error handling

Ensure that allocations are freed before returning in case of errors.

CoverityID: 457069 ("Resource leak")
CoverityID: 457073 ("Resource leak")
CoverityID: 457077 ("Resource leak")

Test-Parameters: trivial
Fixes: ae828cd3b0 ("LU-4684 llite: add lock for dir layout data")
Fixes: ed4a625d88 ("LU-13717 sec: filename encryption - digest support")
Fixes: 2e2b16c28b ("LU-11025 dne: support directory restripe")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5ff33a7243e1f536e5308f61451f205f232540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58361
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18789 ldiskfs: ldiskfs patch adjustments for ubuntu 24.04.2 55/58355/3
Shuichi Ihara [Mon, 10 Mar 2025 01:35:53 +0000 (10:35 +0900)]
LU-18789 ldiskfs: ldiskfs patch adjustments for ubuntu 24.04.2

Ubuntu 24.04.2 is based on linux-6.11.0 by default.
Only a few ldiskfs patch adjustments are needed for it
to build server modules properly.

Test-Parameters: trivial
Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: Ie476ef12568b8ecb94df38b48b51646dc42923da
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58355
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-16518 llite: remove unused ll_default_lmv_inherited() 52/58352/2
Timothy Day [Sat, 8 Mar 2025 22:08:56 +0000 (17:08 -0500)]
LU-16518 llite: remove unused ll_default_lmv_inherited()

This function stopped being used in a previous patch, but it
was never removed. So let's remove it now.

Fixes: 388a185eace0 ("LU-15971 llite: implicit default LMV inherit")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7aef4ad1a08bf55abd6ec2cb906b4198dc3185f0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58352
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-16518 lod: remove unused dt_object_qos_mkdir() 51/58351/2
Timothy Day [Sat, 8 Mar 2025 22:03:30 +0000 (17:03 -0500)]
LU-16518 lod: remove unused dt_object_qos_mkdir()

This function was rendered obsolete in a previous
patch, but was not removed.

Fixes: c1d0a355a6a6 ("LU-12624 lod: alloc dir stripes by QoS")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I33ce5a5b745bff7414df8aa04ecf72d68cf8f715
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58351
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18785 build: ofd, ptlrpc missing prototypes 42/58342/3
Shaun Tancheff [Sat, 8 Mar 2025 03:51:03 +0000 (10:51 +0700)]
LU-18785 build: ofd, ptlrpc missing prototypes

nodemap_range.c:271:6: error: no previous prototype
  for '__range_delete' [-Werror=missing-prototypes]
  271 | void __range_delete(struct nodemap_range_tree *nm_range_tree,
      |      ^~~~~~~~~~~~~~

ofd_oss.c:430:5: error: no previous prototype for 'oss_mod_init'
  [-Werror=missing-prototypes]
  430 | int oss_mod_init(void)
      |     ^~~~~~~~~~~~

Test-Parameters: trivial
Fixes: 0ea23e01945 ("LU-13307 nodemap: have nodemap_add_member support large NIDs")
Fixes: b84f014d733 ("LU-14291 ofd: merge ost module into ofd")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie846dfae7ceb511318ab4ccd9494a633129c2c4d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
6 weeks ago LU-18784 dkms: add systemd check for dkms-debs 41/58341/3
Shaun Tancheff [Sat, 8 Mar 2025 01:44:51 +0000 (08:44 +0700)]
LU-18784 dkms: add systemd check for dkms-debs

The lustre-client-utils packaging of:
  /usr/lib/systemd/system/lnet.service
is conditional upon the presence of systemd.

Include the check when building the dkms-debs target.

Test-Parameters: trivial testgroup=full-dkms
HPE-bug-id: LUS-12776
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I583338cda8fd49cbb845ed71bb2cb34a1db3cc74
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58341
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18687 compat: move generic-radix-tree to lustre_compat 48/58148/2
Timothy Day [Fri, 21 Feb 2025 17:20:06 +0000 (17:20 +0000)]
LU-18687 compat: move generic-radix-tree to lustre_compat

Migrate the backported radix tree code to lustre_compat.

Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iaf6fed877b23829be948f1347d21e1ff7b9ce5a9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58148
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18687 compat: move glob to lustre_compat 45/58145/2
Timothy Day [Fri, 21 Feb 2025 15:55:18 +0000 (15:55 +0000)]
LU-18687 compat: move glob to lustre_compat

Migrate the backported glob code to lustre_compat.

Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7e57326a0ed10225e2ee866071ea7c3d259d29d4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58145
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18728 tests: use urandom to really consume ZFS space 15/58115/8
Bruno Faccini [Tue, 18 Feb 2025 17:36:12 +0000 (18:36 +0100)]
LU-18728 tests: use urandom to really consume ZFS space

It appears newer ZFS is using data compression by default, so reading
from /dev/zero results in files not consuming the expected amount of
space.  Instead, read from /dev/urandom for ZFS to write files in
sanity and conf-sanity to ensure they fill the OSTs, or the image
to be used for target creation, as expected.
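
The effect described above can be reproduced outside ZFS. The sketch below is a hedged illustration in Python, with zlib standing in for the filesystem's compressor (an assumption; the commit does not name the algorithm). A buffer of zeros shrinks to almost nothing, while random bytes stay essentially full size.

```python
import os
import zlib

SIZE = 1 << 20  # 1 MiB of test data

zeros = bytes(SIZE)             # equivalent to reading from /dev/zero
random_data = os.urandom(SIZE)  # equivalent to reading from /dev/urandom

# zlib is only a stand-in for the on-disk compressor (assumption).
zeros_compressed = len(zlib.compress(zeros))
random_compressed = len(zlib.compress(random_data))

print(f"zeros:  {SIZE} -> {zeros_compressed} bytes")
print(f"random: {SIZE} -> {random_compressed} bytes")
```

With compression enabled, only the second kind of data can be relied on to consume roughly the space a test expects.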

Test-Parameters: testgroup=review-zfs env=ZFS_MKFS_OPTS="compression=on"
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I7b4e95032608d8db82c75e4b6dd1ec5beb6f8d99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58115
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18694 sec: nodemap local root user capabilities 66/57966/13
Sebastien Buisson [Fri, 24 Jan 2025 15:31:37 +0000 (16:31 +0100)]
LU-18694 sec: nodemap local root user capabilities

Add a new 'local_admin' rbac role, on by default. The purpose of this
new role is to keep capabilities for root even if it is mapped or
offset. This allows root to be mapped to a non-privileged storage id
while still being able to perform 'admin-like' tasks thanks to
capabilities, such as changing file permissions or file ownership.

Note that setquota and changing the project id are also affected by the
local_admin role. When enabled, root on the client that gets mapped on
the file system side is still able to perform those operations.

Be aware that if root is squashed, then capabilities are dropped as
for any other regular user.

New test sanity-sec test_64h exercises the local_admin role.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5832b21106b2829134a596c2aacf04839be856e9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18662 tests: skip fstrim on unsupported devices 51/57851/19
Alex Zhuravlev [Wed, 22 Jan 2025 06:00:31 +0000 (09:00 +0300)]
LU-18662 tests: skip fstrim on unsupported devices

If an underlying device doesn't support fstrim, then never try it
again.

Fixes: 6872cf9a36 ("LU-17722 tests: trim tmpfs from wait_delete_completed()")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie7e49800ed0161c968e453a531b9701f3459a318
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57851
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18643 tests: Do not create subdirectory on client mount 07/57807/11
Marc Vef [Thu, 16 Jan 2025 18:03:14 +0000 (19:03 +0100)]
LU-18643 tests: Do not create subdirectory on client mount

When calling zconf_mount_clients() with a FILESET, the Lustre client
is mounted at the corresponding subdirectory specified by the FILESET.

However, when the subdirectory does not exist, it is created
automatically and silently. This may hide bugs, e.g., when a test needs
to verify that mounting against a non-existing directory is not
possible.

This patch removes the silent creation of the directory on client
mount, so that it is the caller's responsibility to ensure the subdirectory
exists before mounting. Currently, no tests rely on this
functionality as they already create the subdirectory themselves.
Therefore no test needs to be modified.

Test-Parameters: mdtcount=4 mdscount=2 env=ONLY="247 413" testlist=sanity
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I900c4bff79e6b5bde541eb4e852e42cde01820e3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
6 weeks ago LU-16565 mdc: Remove ldlm is,set,clear macros 47/57547/2
Timothy Day [Fri, 20 Dec 2024 04:14:59 +0000 (23:14 -0500)]
LU-16565 mdc: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I03dfce5398c17201e4f18b3c9792daab751fa8e6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57547
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-8066 nodemap: migrate to debugfs 01/57401/6
James Simmons [Fri, 28 Feb 2025 14:26:23 +0000 (09:26 -0500)]
LU-8066 nodemap: migrate to debugfs

The nodemap interface in proc is for administration purposes only
so we can safely move it to debugfs.

Test-Parameters: trivial testlist=sanity-sec
Change-Id: I0797bb79896ae5d9fa3bf9088b97b10505762565
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-930 man: add proper documentation for replace_nids command 70/57270/6
Artem Blagodarenko [Tue, 3 Dec 2024 23:22:22 +0000 (18:22 -0500)]
LU-930 man: add proper documentation for replace_nids command

The current entries in the lctl.8 man page and the manual lack any
explanation of what the various NIDs mean.
They should explain the behaviour of failover NIDs.

A separate "lctl replace_nids" man page was created and some
additional information added.

Test-Parameters: trivial
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I35e8fa26c109811a7411a73cd40ad811c2256e1b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57270
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-8066 exports: move procfs exports to debugfs 13/57013/12
James Simmons [Tue, 25 Feb 2025 23:08:12 +0000 (18:08 -0500)]
LU-8066 exports: move procfs exports to debugfs

The server side has an exports proc directory with several entries.
Upstreaming requires Lustre not to use the proc directory, so
we can move the exports directory to debugfs. This is server side,
so the root-only issue should be limited. This step will make
more of the stats Netlink work much easier.

Change-Id: I73e38813f049cf563cdc7e277e4fadecd5a94e98
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57013
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-11850 obd: support the rest of "stats" with Netlink 05/57305/3
James Simmons [Wed, 26 Feb 2025 18:47:19 +0000 (11:47 -0700)]
LU-11850 obd: support the rest of "stats" with Netlink

Migrate the remaining "stats" files to the debugfs Netlink API
except for the exports stats. It is possible that we lack
a kobject and a debugfs dentry, so we can end up in a case
where we can't derive a name. So change the API to supply
the stat source name instead. Update the stats packet size
calculation based on the new debugging info in the function
lnet_genl_parse_list().

Test-Parameters: trivial
Change-Id: If52dfb2807cbdcd9a24e9334edfa2101a8483fdd
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-11077 utils: --client option for set_param 58/55858/20
Frederick Dilger [Wed, 12 Jun 2024 20:43:08 +0000 (16:43 -0400)]
LU-11077 utils: --client option for set_param

Added new [--client|-C[FSNAME]] option for 'lctl set_param' which
writes the parameter to the local /etc/lustre/mount.client.params
config file. Upon each Lustre client mount those parameters will
be set on the local node. If FSNAME was provided, the parameters
will be saved in the mount-specific /etc/lustre/mount.FSNAME.params
config file, and will be set (and override) the more generic client
mount parameters on that node when that filesystem is mounted.
However only parameters containing FSNAME can be set to their
respective params config file to avoid generic parameters that are
only supposed to affect a single filesystem, actually affecting all
of them.

Can be used together with [--delete|-d] to remove the parameter from
the given log file. If [--delete|-d] is specified without -C or -P
it will enable -P by default. A warning message will be printed when
this happens so users are aware of what's going on.
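
The file selection and override order described above can be modeled in a few lines. This is a hedged Python sketch, not the lctl implementation: `params_file` and `apply_order` are hypothetical helpers, the "contains FSNAME" rule is assumed to be a plain substring test, and the parameter names in the usage lines are illustrative only. The two file paths are the ones named in the commit message.

```python
GENERIC_PARAMS_FILE = "/etc/lustre/mount.client.params"


def params_file(param, fsname=None):
    """Return the config file a parameter would be saved to.

    Assumption: "containing FSNAME" is modeled as a substring test.
    """
    if fsname is None:
        return GENERIC_PARAMS_FILE
    if fsname not in param:
        # Refuse generic parameters in a filesystem-specific file.
        raise ValueError(f"{param!r} does not name filesystem {fsname!r}")
    return f"/etc/lustre/mount.{fsname}.params"


def apply_order(generic, per_fs):
    """List settings in the order they would take effect on mount.

    Generic client settings come first, so filesystem-specific ones,
    applied afterwards, override them where both affect the same device.
    """
    return list(generic) + list(per_fs)


# Illustrative parameter names only.
print(params_file("llite.*.max_read_ahead_mb"))
print(params_file("llite.testfs-*.max_read_ahead_mb", "testfs"))
```

Rejecting parameters that do not mention the filesystem keeps a setting meant for one mount from silently applying to every mount on the node.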

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: Iec0b9bfb5e259154ed2439e6e505b826a888905f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55858
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-11850 lov: migrate completely to lu_tgt_descs API 59/51959/42
James Simmons [Wed, 5 Mar 2025 19:50:53 +0000 (14:50 -0500)]
LU-11850 lov: migrate completely to lu_tgt_descs API

The lov target handling was written before the generic lu_tgt
was written. Migrate to this newer API so lov can be treated
the same as lmv and lod. With the changes we have the new
lov_foreach_tgt() macro that traverses all the registered
targets, up to a total of ltd->ltd_tgts_size. Also lov_tgt()
was created to extract a target by its index. Internally
a bitmap is used to tell if the tgt has been set up and
ltd->ltd_tgts_size defines the largest possible index.
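
The bitmap-plus-index scheme described above can be sketched as a small model. This is a hedged illustration in Python, not the Lustre code: `TargetTable`, `add`, `remove`, `tgt`, and `foreach` are hypothetical names standing in for the lu_tgt_descs structures, lov_tgt(), and lov_foreach_tgt(), and `size` is assumed to be the number of slots.

```python
class TargetTable:
    """Toy model of a target table: slots plus a 'configured' bitmap."""

    def __init__(self, size):
        self.size = size            # number of slots (assumed meaning)
        self.slots = [None] * size  # target descriptors by index
        self.bitmap = 0             # bit i set => slot i has been set up

    def add(self, index, tgt):
        if not 0 <= index < self.size:
            raise IndexError(index)
        self.slots[index] = tgt
        self.bitmap |= 1 << index

    def remove(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        self.slots[index] = None
        self.bitmap &= ~(1 << index)

    def tgt(self, index):
        """Return the target at index, or None if it is not set up."""
        if 0 <= index < self.size and (self.bitmap >> index) & 1:
            return self.slots[index]
        return None

    def foreach(self):
        """Yield (index, target) for every configured slot, in order."""
        for index in range(self.size):
            if (self.bitmap >> index) & 1:
                yield index, self.slots[index]
```

Iterating over the bitmap rather than the raw array means sparse tables are handled uniformly and callers never see a slot that was not set up.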

Another change: since the largest OST offset that
a striped Lustre file can have is 65503, we reduce the
largest possible OST index, because the last OSTs
could never be used.

Fixes: 1a6ef725c2 ("LU-16938 utils: setstripe overstripe multiple OST count")
Change-Id: If3f53b2a4589f93a024fa026ba377e2175282c29
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18776 mdt: prevent multiple data discard calls 02/58302/3
Mikhail Pershin [Wed, 5 Mar 2025 14:47:37 +0000 (17:47 +0300)]
LU-18776 mdt: prevent multiple data discard calls

The mdt_dom_discard_data() function might be called multiple times
for the same object. That creates cyclical locks for no
reason, and moreover their callbacks are executed in the
same thread recursively, causing a stack overflow.

The patch introduces the mdt_object flag mot_discard_done to
indicate that data discard was initiated once and there is no
need for another one.
Additionally, the patch doesn't allow the same thread to be used
for lock callback if ldlm_is_ast_discard_data() is true
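
The guard described above can be illustrated with a toy model. This is a hedged Python sketch, not the MDT code: `MdtObject`, `discard_data`, and `lock_callback` are hypothetical stand-ins, and the callback re-entering the discard path in the same thread is a simplifying assumption about how the recursion arises.

```python
class MdtObject:
    def __init__(self):
        self.discard_done = False   # models the mot_discard_done flag
        self.discards_started = 0


def lock_callback(obj, guarded):
    # Toy assumption: the callback runs in the caller's thread and
    # requests another discard of the same object.
    discard_data(obj, guarded)


def discard_data(obj, guarded=True):
    if guarded and obj.discard_done:
        return  # a discard was already initiated, nothing more to do
    obj.discard_done = True
    obj.discards_started += 1
    lock_callback(obj, guarded)


obj = MdtObject()
discard_data(obj)
print(obj.discards_started)  # prints 1
```

Calling `discard_data(MdtObject(), guarded=False)` in this model recurses until Python raises `RecursionError`, which is the analogue of the stack overflow the commit describes.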

Fixes: 291ac6e692 ("LU-17078 ldlm: do not spin up thread for local cancels")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7dc5d0da93a38e04267e007f5132ddb20788f18f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58302
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-18769 lnet: lnetctl memory corruption because of buffer overflow 88/58288/4
Manish Regmi [Mon, 3 Mar 2025 23:22:00 +0000 (15:22 -0800)]
LU-18769 lnet: lnetctl memory corruption because of buffer overflow

Sometimes the user-supplied name is larger than the size of
lnet_dlc_intf_descr.intf_name. Add proper validation checks before
strncpy and strcpy so that the buffer does not overflow.

Test-Parameters: trivial
Signed-off-by: Manish Regmi <mregmi@ddn.com>
Change-Id: Ifa867cd60ded64fcefe0a6b948f34e9f542e6e04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18448 llite: read dir on open 69/57069/16
Alexey Lyashkov [Fri, 15 Nov 2024 09:16:04 +0000 (12:16 +0300)]
LU-18448 llite: read dir on open

Read some pages at the start of the directory on open,
since a client will probably need them.

Benchmark: walk over ~100k directories with 150 files on the last leaf.

readdir on open enabled.

    real    0m39.977s
    user    0m0.121s
    sys     0m7.161s

readdir on open disabled

    real    1m18.106s
    user    0m0.151s
    sys     0m15.666s

HPE-bug-id: LUS-7695
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Iaa674ce0d2e5723b380d7ca09407b27a90bc37f5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57069
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
6 weeks ago LU-18177 lustre: use enum cl_attr_valid instead of unsigned 82/56182/4
Bobi Jam [Wed, 28 Aug 2024 14:38:59 +0000 (22:38 +0800)]
LU-18177 lustre: use enum cl_attr_valid instead of unsigned

The last parameter of coo_attr_update() should be enum cl_attr_valid
instead of __u32/unsigned int

Test-Parameters: trivial
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I1e02f1f3621d82d5e279f6d37571ea43929f083e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56182
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks ago LU-18773 osc: initialize index_orig in osc_brw_prep_request 96/58296/2
Jian Yu [Tue, 4 Mar 2025 22:08:41 +0000 (14:08 -0800)]
LU-18773 osc: initialize index_orig in osc_brw_prep_request

This patch initializes index_orig in osc_brw_prep_request() to
fix the following error:

  In function 'osc_brw_prep_request':
  error: 'index_orig' may be used uninitialized in this function
  [-Werror=maybe-uninitialized]
       brwpg->pg->index = index_orig;
       ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~

Change-Id: I97188ea21adfa25950814a04e4f6ffdb9b763712
Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58296
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 7) 79/58279/2
Arshad Hussain [Mon, 3 Mar 2025 07:29:33 +0000 (12:59 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 7)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments.

./kernel-doc -v -none lustre/llite/symlink.c lustre/llite/rw26.c
llite/symlink.c:244: info: Scanning doc for function ll_getattr_link
llite/rw26.c:36: info: Scanning doc for function ll_invalidate_folio
llite/rw26.c:92: info: Scanning doc for function ll_invalidatepage
llite/rw26.c:719: info: Scanning doc for function ll_prepare_partial_page

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9c9fd4c5c1edc426df42165c11c54fdd694bf722
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58279
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 6) 78/58278/2
Arshad Hussain [Thu, 27 Feb 2025 15:00:54 +0000 (20:30 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 6)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments.

Tested:
./kernel-doc -v -none lustre/llite/llite_mmap.c lustre/llite/llite_nfs.c
lustre/llite/llite_mmap.c:72: info: Scanning doc for function ll_fault_io_init
lustre/llite/llite_mmap.c:266: info: Scanning doc for function ll_fault0
lustre/llite/llite_mmap.c:536: info: Scanning doc for function ll_vm_open
lustre/llite/llite_mmap.c:560: info: Scanning doc for function ll_vm_close
lustre/llite/llite_nfs.c:234: info: Scanning doc for function ll_encode_fh

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I31cc93b570db31550aa3bdc919dbd8ce82ce47a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58278
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 5) 77/58277/2
Arshad Hussain [Mon, 3 Mar 2025 07:09:42 +0000 (12:39 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 5)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments.

Tested:
./kernel-doc -v -none lustre/llite/namei.c lustre/llite/lproc_llite.c
llite/namei.c:101: info: Scanning doc for function ll_iget
llite/namei.c:1299: info: Scanning doc for function ll_atomic_open
llite/lproc_llite.c:83: info: Scanning doc for function ll_stats_pid_write
llite/lproc_llite.c:1355: info: Scanning doc for function default_easize_show
llite/lproc_llite.c:1383: info: Scanning doc for function default_easize_store

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8178ca5c2605341f13e307ef5e194f2b4ba8a5bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58277
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-18760 dkms: race on clobber and create of modules.order 61/58261/3
Shaun Tancheff [Fri, 28 Feb 2025 03:28:52 +0000 (10:28 +0700)]
LU-18760 dkms: race on clobber and create of modules.order

DKMS builds fail occasionally with an error:

cat: /var/lib/dkms/.../build//modules.order: No such file or directory
  MODPOST /var/lib/dkms/.../build/Module.symvers

This appears to be a make bug where a path containing //
is not understood correctly.

Remove the unnecessary injection of / in the list of SUBDIRS
to be built in when ldiskfs is not enabled.
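
How an empty subdirectory entry produces the doubled slash can be shown with plain string handling. This is a hedged Python sketch, not the Lustre build logic: `join_naive` and `join_clean` are hypothetical helpers and `/tmp/build` is a placeholder path. Whether make compares target names as plain strings is an assumption; the commit only says the bug appears to be in make.

```python
import posixpath


def join_naive(build_dir, subdir, name):
    # An empty subdir still contributes a separator: "build//name".
    return build_dir + "/" + subdir + "/" + name


def join_clean(build_dir, subdir, name):
    # Skip empty components so that no extra "/" is injected.
    parts = [p for p in (build_dir, subdir, name) if p]
    return "/".join(parts)


naive = join_naive("/tmp/build", "", "modules.order")
clean = join_clean("/tmp/build", "", "modules.order")

print(naive)                               # /tmp/build//modules.order
print(clean)                               # /tmp/build/modules.order
print(naive == clean)                      # False: different strings
print(posixpath.normpath(naive) == clean)  # True: same file on disk
```

Both strings name the same file, but any tool that matches paths textually treats them as different, so avoiding the empty component is the safer fix.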

Test-Parameters: trivial
HPE-bug-id: LUS-12672
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6dda02133115076b076e6adf2ebabd10895af643
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58261
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-18687 compat: move xarray to lustre_compat 14/58114/4
Timothy Day [Wed, 12 Feb 2025 01:35:30 +0000 (20:35 -0500)]
LU-18687 compat: move xarray to lustre_compat

Migrate the backported xarray code to lustre_compat.
Along the way, create the needed build infrastructure
for lustre_compat. Currently, lustre_compat is built
into libcfs.ko.

Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I74249d0b5714bee3549bf42a8fede3f279bc37ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58114
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-18691 quota: quota interop check for 64k page clients 61/57961/6
Shaun Tancheff [Fri, 7 Feb 2025 13:21:53 +0000 (20:21 +0700)]
LU-18691 quota: quota interop check for 64k page clients

When hitting the end of available quota, a race condition can be hit
which allows a 64k-unaligned I/O to be submitted and causes the
node to hang indefinitely.

This happens when a partial write hits quota limits and a subsequent
write is not aligned on a 64k page boundary, triggering a hang due to
64k vs 4k page aligned transfers.
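
The alignment mismatch described above comes down to simple arithmetic. The sketch below is a hedged Python illustration of that arithmetic only; it does not reproduce the race or the quota logic, and `is_aligned` and `align_down` are hypothetical helpers. An offset that is perfectly aligned for 4k pages can still fall in the middle of a 64k page.

```python
PAGE_4K = 4 * 1024
PAGE_64K = 64 * 1024


def is_aligned(offset, page_size):
    """True if offset sits exactly on a page boundary."""
    return offset % page_size == 0


def align_down(offset, page_size):
    """Round offset down to the start of its page."""
    return offset - (offset % page_size)


# A partial write that stopped at 76 KiB.
offset = 76 * 1024
print(is_aligned(offset, PAGE_4K))           # True
print(is_aligned(offset, PAGE_64K))          # False
print(align_down(offset, PAGE_64K) // 1024)  # 64
```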

HPE-bug-id: LUS-12724
Test-Parameters: testlist=sanity-quota clientarch=ppc64le clientdistro=el8.9 serverdistro=el9.4 env=ONLY=88,ONLY_REPEAT=10
Test-Parameters: testlist=sanity-quota clientarch=ppc64le clientdistro=el8.9 serverdistro=el8.9 serverversion=2.15.4 env=ONLY=88,ONLY_REPEAT=10
Test-Parameters: testlist=sanity-quota clientarch=aarch64 clientdistro=el9.3 serverdistro=el8.10 env=ONLY=88,ONLY_REPEAT=10
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0f8638062f8b0e57207695c45e1fccbd7492c32d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16565 mdt: Remove ldlm is,set,clear macros 45/57545/2
Timothy Day [Fri, 20 Dec 2024 04:14:11 +0000 (23:14 -0500)]
LU-16565 mdt: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I866c23748a176e8cc4e391e5111d1133caf2988f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57545
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16565 osc: Remove ldlm is,set,clear macros 44/57544/2
Timothy Day [Fri, 20 Dec 2024 04:13:50 +0000 (23:13 -0500)]
LU-16565 osc: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I40f627f5dbaa072b4a1e91adbca1c88d727fb84d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57544
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16565 quota: Remove ldlm is,set,clear macros 43/57543/2
Timothy Day [Fri, 20 Dec 2024 04:13:09 +0000 (23:13 -0500)]
LU-16565 quota: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I364cb11cbbfc00f133e1193204a920233b3a1b37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16565 target: Remove ldlm is,set,clear macros 42/57542/3
Timothy Day [Fri, 20 Dec 2024 04:12:17 +0000 (23:12 -0500)]
LU-16565 target: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I63198e3278d9be930c768b64ffdccc9cd1e74a76
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12885 mds: add enums for MDS_OPEN flags (4/4) 12/56812/12
Arshad Hussain [Mon, 28 Oct 2024 04:56:56 +0000 (00:56 -0400)]
LU-12885 mds: add enums for MDS_OPEN flags (4/4)

This patch is the fourth in the series of patches that separate
kernel open flags from MDS open flags.

This patch adds the function ll_kernel_to_mds_open_flags() to
convert kernel flags (fmode) to MDS flags in one place.

This patch removes the macros O_LOV_DELAY_CREATE_1_8 and
O_LOV_DELAY_CREATE_MASK everywhere, as they were only
required for interop with applications written for Lustre
1.8 clients and are not used any more.

This patch adds the functions ll_lov_delay_create_is_set() and
ll_lov_delay_create_clear() to check for and clear the O_LOV_DELAY_CREATE
flag if found in struct file->fmode.

This patch removes the remaining fmode to mds_open_flags
conversions wherever they were left.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic125dc0c7fa54888fddf435c117de9d304ea8708
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16796 ptlrpc: Change struct ptlrpc_reply_state to use kref 64/56364/3
Arshad Hussain [Wed, 11 Sep 2024 07:17:19 +0000 (03:17 -0400)]
LU-16796 ptlrpc: Change struct ptlrpc_reply_state to use kref

This patch changes struct ptlrpc_reply_state to use
kref instead of atomic_t

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I15d4982e709fe420b1fade4108581fbc7669058e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56364
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16796 lfsck: Change lfsck_instance to use refcount_t 95/56195/2
Arshad Hussain [Thu, 29 Aug 2024 09:34:01 +0000 (05:34 -0400)]
LU-16796 lfsck: Change lfsck_instance to use refcount_t

This patch changes struct lfsck_instance to use
refcount_t instead of atomic_t

Test-Parameters: trivial testlist=sanity-lfsck
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9bf3337ed7b68dbd44e723bf7c1374a8e3a07eb7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16796 ptlrpc: Change struct ptlrpc_request_set to use kref 59/53459/13
Arshad Hussain [Thu, 14 Dec 2023 11:26:11 +0000 (16:56 +0530)]
LU-16796 ptlrpc: Change struct ptlrpc_request_set to use kref

This patch changes struct ptlrpc_request_set to use
kref instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icd8dc9d532121b9087455b951a1b7ee922ab532c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Timothy Day <timday@amazon.com>
7 weeks ago LU-6142 gss: SPDX for GSS 22/57822/3
Timothy Day [Sun, 19 Jan 2025 20:05:13 +0000 (15:05 -0500)]
LU-6142 gss: SPDX for GSS

Convert from verbose license text to SPDX. These files are
largely derived from in-kernel sunrpc GSS code, which derived
it from the Kerberos project.

I've tracked down each file in upstream Linux - and either
applied the correct license to the Lustre version or omitted
the SPDX (as the kernel does) along with an updated full-path
to the Linux kernel file.

Also, add the BSD-3-Clause license text.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I19283cb8d3d625842984b9112014cc58a5a04726
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57822
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18762 lnet: lst SIGSEGV in parser.c 69/58269/2
Frank Sehr [Fri, 28 Feb 2025 21:49:59 +0000 (13:49 -0800)]
LU-18762 lnet: lst SIGSEGV in parser.c

Fix a null pointer problem: a null pointer was passed instead of the
command structure array. Add an additional check.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: I7d4e44170aeb8e44c55de681b6b3def781a0b1bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Manish Regmi <mregmi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18753 socklnd: remove unused ksocknal_find_peer() 63/58263/2
Timothy Day [Fri, 28 Feb 2025 06:23:27 +0000 (01:23 -0500)]
LU-18753 socklnd: remove unused ksocknal_find_peer()

Remove ksocknal_find_peer() since it is not called
anywhere.

Fixes: 0d816af574b7 ("LU-11300 lnet: remove lnd_query interface.")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia9d15882260bf25ebf92d60239f666a6cc97d04a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58263
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18754 build: explicitly include openssl/rand.h 31/58231/2
Shaun Tancheff [Wed, 26 Feb 2025 12:04:15 +0000 (19:04 +0700)]
LU-18754 build: explicitly include openssl/rand.h

el10 build fails with:
   error: implicit declaration of function 'RAND_bytes'

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ieb1b75fbf7029b712addf9222d412d3bfa91e59e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58231
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18753 mdt: remove mdt_buf code 28/58228/2
Timothy Day [Wed, 26 Feb 2025 05:49:41 +0000 (00:49 -0500)]
LU-18753 mdt: remove mdt_buf code

Remove unused MDT buffer code, including mdt_buf()
and mdt_buf_const().

Fixes: 26b823865997 ("LU-3105 osd: remove capa related stuff from servers")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7c0ffc820e94886902a1dbf09e01d70761f7d8fd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18753 obdclass: remove obsolete dt functions 27/58227/3
Timothy Day [Wed, 26 Feb 2025 05:37:57 +0000 (00:37 -0500)]
LU-18753 obdclass: remove obsolete dt functions

Remove:

dt_path_parser()
dt_store_resolve()
dt_store_open()
dt_reg_open()
dt_find_entry()
typedef dt_entry_func_t

These are not called anywhere.

Fixes: 29e98f581ab6 ("LU-2886 obdclass: remove obsoleted md_local_file.c")
Fixes: 29adfde10ff2 ("LU-2886 mdd: create local files using local_storage lib")
Fixes: 90d8e7fd2874 ("Land b_head_interop_disk  on HEAD (20081119_1314)")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5a79f665104c0526db1e328c0e682ce85b592028
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58227
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18753 ptlrpc: remove unused stub functions 26/58226/2
Timothy Day [Wed, 26 Feb 2025 05:14:49 +0000 (00:14 -0500)]
LU-18753 ptlrpc: remove unused stub functions

Remove:

ptlrpc_ping_import_soon()
flavor_copy()
ptlrpc_cleanup_client()
__lustre_swab_buf()

They are not called anywhere.

Fixes: 86b2211e55dc ("LU-290 Reconnects are not throttled")
Fixes: 3565394baa95 ("LU-3289 gss: Add userspace support for GSS null and sk")
Fixes: 3ee0e0908f12 ("LU-5829 ptlrpc: remove unnecessary EXPORT_SYMBOL")
Fixes: 23fad25a5b6b ("b=18631")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib45aba0b76d086b0a657bb3fc79d1ec74b1e3302
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58226
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18753 ptlrpc: remove ptlrpcd_add_rqset() 25/58225/2
Timothy Day [Wed, 26 Feb 2025 05:07:50 +0000 (00:07 -0500)]
LU-18753 ptlrpc: remove ptlrpcd_add_rqset()

ptlrpcd_add_rqset() has no callers. Remove it.

Fixes: 03f537c50b76 ("LU-2244 lov: remove unused bits from lov, osc")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ica55c7559c0244af12bf1b47b350f8aeb0398f03
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58225
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18751 lnet: Segfault in lnetctl fault command 16/58216/2
Frank Sehr [Wed, 26 Feb 2025 00:24:17 +0000 (16:24 -0800)]
LU-18751 lnet: Segfault in lnetctl fault command

"lnetctl fault reset 0" and similar variations cause a segfault. This
is caused by a null pointer that is not checked in the code.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: Iec580b19f97c2a189ae8f29444bf3e3cc91d78a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Manish Regmi <mregmi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-17778 tests: fix conf-sanity/76d issues 15/58215/2
Andreas Dilger [Sat, 11 Jan 2025 00:16:00 +0000 (17:16 -0700)]
LU-17778 tests: fix conf-sanity/76d issues

There was a race between remounting a client and the MGS parameter
settings being applied, so ensure the parameter setting is rechecked
if it is not correct the first time.

Also, the parameter checking for $MOUNT2 was not necessarily checking
the right instance of the parameter on the client.  Use the instance
name for the mountpoint to ensure this is checked correctly.

Test-Parameters: trivial
Fixes: fa1bff8f6f ("LU-9399 llite: register mountpoint before process llog")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib6acdc80b880dfc90b0c10406a0f868211433f58
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Nushafreen Palsetia <npalsetia@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18749 socklnd: check page for zerocopy 05/58205/2
Yang Sheng [Mon, 10 Feb 2025 19:35:20 +0000 (03:35 +0800)]
LU-18749 socklnd: check page for zerocopy

Check the page state to ensure the kernel can
handle the page in the zerocopy case.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ib82989bcca9898ecc176ddc0c9a6cd4eafbad89f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58205
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18354 tests: speed up sanity/136 on non-ZFS 99/58199/3
Andreas Dilger [Mon, 24 Feb 2025 20:52:37 +0000 (13:52 -0700)]
LU-18354 tests: speed up sanity/136 on non-ZFS

The previous change to sanity test_136 to improve test reliability
on ZFS servers resulted in the test time increasing by about 8x
(from ~300s to ~2400s).  Only wait for deletion and drop caches on
ZFS MDS nodes, and not on ldiskfs where this is not needed.

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY=136,SLOW=yes,ONLY_MINUTES=30 fstype=zfs
Test-Parameters: testlist=sanity env=ONLY=136,SLOW=yes,ONLY_MINUTES=30
Fixes: 627cc62369 ("LU-18354 tests: avoid sanity/136 OOM on ZFS servers")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic5dc79f9b7e6c2df50a97d0447ef3aa9d3c73e1d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58199
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Timothy Day <timday@amazon.com>
2 months agoLU-18744 tests: fix sanity-sec test_27ab 87/58187/3
Sebastien Buisson [Mon, 24 Feb 2025 12:05:54 +0000 (13:05 +0100)]
LU-18744 tests: fix sanity-sec test_27ab

When nodemap is activated in sanity-sec test_27ab, we need to make
sure the default nodemap grants root access, so that clients and
servers can be stopped and restarted.
Also fix an incorrect call to 'lctl nodemap_add_idmap'.

Test-Parameters: trivial
Fixes: e3051ad0f1 ("LU-18109 utils: adding nodemap offset capability")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0adafae67c7637c616c687590bd01ff12f4d6bf2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-16518 mdt: fix unused-but-set-variable warnings 84/58184/2
Timothy Day [Mon, 24 Feb 2025 06:38:13 +0000 (01:38 -0500)]
LU-16518 mdt: fix unused-but-set-variable warnings

Remove unused variables in various parts of the MDT code.
This silences the Clang compiler -Wunused-but-set-variable
warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I2c42e9ce86ac854a49f8b12f5325ce1f34b8ecc3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58184
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-16518 target: fix unused-but-set-variable warnings 83/58183/2
Timothy Day [Mon, 24 Feb 2025 06:21:23 +0000 (01:21 -0500)]
LU-16518 target: fix unused-but-set-variable warnings

In tgt_checksum_niobuf(), remove the unused err return code
of cfs_crypto_hash_final() to silence a Clang compiler
warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6dfe664479d4430c4386d2ff50644e41d91a4c28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58183
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16518 osc: fix unused-but-set-variable warnings 82/58182/2
Timothy Day [Mon, 24 Feb 2025 06:18:11 +0000 (01:18 -0500)]
LU-16518 osc: fix unused-but-set-variable warnings

When CONFIG_CRC_T10DIF=n and osc_checksum_bulk_t10pi() is
a macro, Clang generates compiler warnings for some of the
arguments - since they are not used elsewhere. Silence this
by creating a proper function.
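
A rough sketch of the idea, under stated assumptions: DEMO_HAVE_T10PI
stands in for the kernel config option, demo_checksum_t10pi() for the
real function, and the -EOPNOTSUPP return value is a guess rather than
what the Lustre stub actually returns.

```c
#include <assert.h>
#include <errno.h>

/* DEMO_HAVE_T10PI is a hypothetical stand-in for CONFIG_CRC_T10DIF. */
#ifdef DEMO_HAVE_T10PI
int demo_checksum_t10pi(const char *obd_name, int nob, unsigned int *cksum);
#else
/*
 * A macro stub such as
 *   #define demo_checksum_t10pi(name, nob, cksum) (-EOPNOTSUPP)
 * discards its arguments, so a caller's variable that is only assigned
 * and then passed here looks "set but unused" to the compiler.
 * A function stub still consumes every argument.
 */
static inline int demo_checksum_t10pi(const char *obd_name, int nob,
				      unsigned int *cksum)
{
	(void)obd_name;
	(void)nob;
	(void)cksum;
	return -EOPNOTSUPP;
}
#endif

static int demo_caller(void)
{
	unsigned int cksum = 0;
	int nob;

	nob = 4096;	/* assigned, then used only as an argument */
	return demo_checksum_t10pi("demo-osc", nob, &cksum);
}
```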

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I502dcf1764602711fcf2cf3553ad6d2f4fed3f14
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58182
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16518 lnet: fix unused-but-set-variable warnings 81/58181/2
Timothy Day [Mon, 24 Feb 2025 06:12:08 +0000 (01:12 -0500)]
LU-16518 lnet: fix unused-but-set-variable warnings

Remove unused primary_nid variable in lnet_peer_merge_data()
to silence a Clang compiler warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib66fd31c7acc08fa66578cd7ab571f278f98afe1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58181
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16518 ldlm: fix unused-but-set-variable warnings 80/58180/2
Timothy Day [Mon, 24 Feb 2025 01:53:27 +0000 (20:53 -0500)]
LU-16518 ldlm: fix unused-but-set-variable warnings

In ldlm_flock_completion_ast(), obd* is not being
used. Remove it to silence the Clang compiler warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Idc8d0fd9a351b4bb328a0956eabd2026460cdfe1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58180
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16518 obdclass: fix unused-but-set-variable warnings 79/58179/2
Timothy Day [Mon, 24 Feb 2025 01:50:28 +0000 (20:50 -0500)]
LU-16518 obdclass: fix unused-but-set-variable warnings

Remove swab variable in class_config_yaml_output()
that is not being used.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I0db780c98fe34cef91988fc3f8c2ccac9481de2c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58179
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16518 llite: fix unused-but-set-variable warnings 78/58178/2
Timothy Day [Mon, 24 Feb 2025 01:47:17 +0000 (20:47 -0500)]
LU-16518 llite: fix unused-but-set-variable warnings

Remove unused variables in llite. Clang does not like
these, so discard them.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Icb555135fc5ee53c2a7b2819beed4a78fe89b91d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58178
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-16518 pcc: fix unused-but-set-variable warnings 77/58177/2
Timothy Day [Mon, 24 Feb 2025 01:25:30 +0000 (20:25 -0500)]
LU-16518 pcc: fix unused-but-set-variable warnings

In several places, pcc_file is set but never used.
Clang doesn't like this, so discard this variable.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7fde9d49bd6309335e6f9083ba588fc86d495b1c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58177
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-17995 obdclass: remove obdname2fsname() 74/58174/2
Timothy Day [Sun, 23 Feb 2025 18:26:59 +0000 (13:26 -0500)]
LU-17995 obdclass: remove obdname2fsname()

This function is not called anywhere.

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib6c92787685564e812634c8a466b9edc27ba6977
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58174
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-17995 osd-zfs: remove server_name_is_ost() 63/58163/2
Timothy Day [Sat, 22 Feb 2025 17:54:10 +0000 (12:54 -0500)]
LU-17995 osd-zfs: remove server_name_is_ost()

We can determine whether a server name is for an
OST by checking the return code of server_name2index().
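
The sketch below only illustrates the pattern of deriving the target
type from a name-parsing helper's return code. demo_name2index() and
the DEMO_SV_TYPE_* values are hypothetical stand-ins, and the name
format (fsname-OSTxxxx with a hexadecimal index) is an assumption; this
is not the real server_name2index() signature.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SV_TYPE_MDT 1	/* hypothetical return values */
#define DEMO_SV_TYPE_OST 2

/*
 * Parse "<fsname>-MDTxxxx" or "<fsname>-OSTxxxx".
 * Returns the target type on success or a negative errno on failure.
 */
static int demo_name2index(const char *svname, unsigned long *idx)
{
	const char *tgt;
	char *end;
	unsigned long val;
	int type;

	if (svname == NULL)
		return -EINVAL;
	tgt = strrchr(svname, '-');
	if (tgt == NULL)
		return -EINVAL;
	tgt++;

	if (strncmp(tgt, "MDT", 3) == 0)
		type = DEMO_SV_TYPE_MDT;
	else if (strncmp(tgt, "OST", 3) == 0)
		type = DEMO_SV_TYPE_OST;
	else
		return -EINVAL;

	if (tgt[3] == '\0')
		return -EINVAL;
	val = strtoul(tgt + 3, &end, 16);
	if (*end != '\0')
		return -EINVAL;

	if (idx != NULL)
		*idx = val;
	return type;
}

/* No separate "is OST" parser is needed; the return code is enough. */
static int demo_is_ost(const char *svname)
{
	return demo_name2index(svname, NULL) == DEMO_SV_TYPE_OST;
}
```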

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I2e1c337b9095d333772f87bd2a5253966b54bd45
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58163
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18516 quota: use wait_woken for qsd_op_begin0() 56/58156/3
James Simmons [Sat, 22 Feb 2025 12:56:17 +0000 (07:56 -0500)]
LU-18516 quota: use wait_woken for qsd_op_begin0()

Kernels with debugging enabled report for Lustre quota handling:

do not call blocking ops when !TASK_RUNNING;
? __might_sleep+0x9d/0xc0
  down_read_nested+0x2e/0x4b0
  lquota_disk_read+0x46e/0x800 [lquota]
  qsd_refresh_usage+0x105/0x3d0 [lquota]
  qsd_acquire+0xbe/0x7c0 [lquota]
  qsd_op_begin0+0x5f8/0xc80 [lquota]

This is due to qsd_acquire() performing operations that can sleep while
the kthread is in an idle state. The Linux kernel solution for this
is wait_woken(). Move the function qsd_op_begin0() from using
wait_event_idle_timeout() to wait_woken(). This will resolve the
potential sleeping issues.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: Id2b7a5886869bf0ed3d560e159524dcda841d8b0
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-8289 utils: fix ll_decode_linkea doc 28/58128/2
Sohei Koyama [Wed, 19 Feb 2025 07:24:42 +0000 (16:24 +0900)]
LU-8289 utils: fix ll_decode_linkea doc

Arguments "#123451" and "#123452" in example were hidden by "\".

Test-Parameters: trivial
Signed-off-by: Sohei Koyama <skoyama@ddn.com>
Change-Id: I059abac920fcc5ecfe03eddc00fef1dd6d89db27
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58128
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 4) 00/58100/2
Arshad Hussain [Mon, 17 Feb 2025 11:11:26 +0000 (16:41 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 4)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments.

Tested with:
<kernel src path>/scripts/kernel-doc -v -none <file>

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I1b0eae8e684d96b843cd5da15d6ed2ef944ad9d2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58100
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 3) 98/58098/2
Arshad Hussain [Mon, 17 Feb 2025 08:15:42 +0000 (13:45 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 3)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments.

Tested with:
<kernel src path>/scripts/kernel-doc -v -none <file>

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I360e4d93d161e17172095b638cbf3628791c35a6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58098
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18723 hsm: sanity-hsm 500 hung in llapi_hsm_copytool_recv 84/58084/5
Sebastien Buisson [Fri, 14 Feb 2025 09:16:56 +0000 (17:16 +0800)]
LU-18723 hsm: sanity-hsm 500 hung in llapi_hsm_copytool_recv

sanity-hsm hung in test_500 in llapi_hsm_test test100.
The bug can be reproduced by the following test script:
ONLY="411 500" REFORMAT=yes ./sanity-hsm.sh

The cause is that the previous test case, 411, does not clean up
properly: it fails to unregister the HSM agent because of insufficient
permissions under the active rbac role, and -EPERM is returned:
mdt_hsm_ct_unregister() {
	...
	if (!mdt_hsm_is_admin(info))
		GOTO(out, rc = -EPERM);
	...

This bug can easily be solved by making sure nodemap is always removed
before the copytool is cleaned up.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I093775eeaf39b4d2671e3a05e41f33a9e1d8ec5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58084
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Robert Read <rread@ddn.com>
Reviewed-by: Robert Read <rread@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18705 pcc: only reset file mapping for valid cached file 99/57999/3
Qian Yingjin [Thu, 6 Feb 2025 12:41:59 +0000 (20:41 +0800)]
LU-18705 pcc: only reset file mapping for valid cached file

pcc_file_mapping_reset() should only reset and revise the file
mapping for a valid cached file (@cached == true).

Otherwise, sanity-pcc test_97 panics as follows:
(pcc.c:3077:pcc_vma_file_reset())
ASSERTION( vma->vm_file->f_mapping == inode->i_mapping ) failed:
panic+0x114/0x2f6
lbug_with_loc.cold+0x30/0x69 [libcfs]
pcc_mmap_io_init+0xafe/0xd60 [lustre]
pcc_fault+0x170/0x3d0 [lustre]
ll_fault+0x43/0x9a0 [lustre]
__do_fault+0x3c/0x170
do_fault+0x24b/0x640

Test-Parameters: testlist=sanity-pcc env=ONLY=97,ONLY_REPEAT=50
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I7e9cd0cba9d230160c90a32bef452139c23164b3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57999
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-18750 kernel: update RHEL 9.5 [5.14.0-503.26.1.el9_5] 12/58212/2
Jian Yu [Tue, 25 Feb 2025 19:30:32 +0000 (11:30 -0800)]
LU-18750 kernel: update RHEL 9.5 [5.14.0-503.26.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.26.1.el9_5.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.4 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.5 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3

Change-Id: Id381dd6628ad738fa23ddbe3746f42457269595f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18566 lnet: dynamically configure timeouts 14/57514/6
Caleb Carlson [Tue, 18 Jun 2024 19:04:42 +0000 (13:04 -0600)]
LU-18566 lnet: dynamically configure timeouts

Add/use default LND timeouts:
* SOCKNAL_TIMEOUT_DEFAULT = 50,
* IBLND_TIMEOUT_DEFAULT   = 50,
* KFILND_TIMEOUT_DEFAULT  = 125,
* GNILND_TIMEOUT_BASE    = 60

LND timeouts default to these if not set by kernel
module params. Return only this value from the
<lnd>_timeout() functions, dropping the call to
lnet_get_lnd_timeout() which was based on the LTT and
LRC values.

Adds lnd_get_timeout() function to the lnet_lnd API
procedural struct, which returns the LND timeout of
whichever LND initialized the struct.

Use this lnd_get_timeout() function to update
the lnet_lnd tunables upon retrieval, to get current
value from module parameters.

For kfilnd, switch to using kfilnd_timeout() instead of
lnet_get_lnd_timeout(). Define KP_PURGE_LIMIT for KFI
peer purge timeout limits.

For lolnd, there's no timeout function definition, so
added conditional logic to check if the timeout function
is valid and returns a positive integer. Also, LNetGet
using the loopback LND creates the message with both
msg_txni and msg_rxni being NULL, so we check for that
condition.

Use control flow for send/recv to find correct msg NI.
Fix formatting of struct array in nidstrings.c.

Add module param path variables for ksocklnd,
kkfilnd, and kgnilnd. Renames the o2ib_modparam
variable to be more consistent:
o2iblnd_modparam_path.

Remove dependency on default lnet_lnd_timeout value
in kgnilnd_timeout() function; use tunable value
instead.

Fallback to lnet_get_lnd_timeout() if tunables timeout
value is 0 (or is unset).
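
The fallback described above, together with the check for an LND that
has no timeout function, might be sketched as follows. The struct
layout, the callback signature, and the numeric values are assumptions
made for illustration; only the name lnd_get_timeout comes from this
commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the LND ops struct; signature is assumed. */
struct demo_lnd {
	int (*lnd_get_timeout)(void);	/* may be NULL, e.g. loopback LND */
};

/* Illustrative values only. */
static int demo_global_lnd_timeout(void) { return 16; }
static int demo_sock_timeout(void) { return 50; }
static int demo_unset_timeout(void) { return 0; }

/*
 * Prefer the LND's own timeout when the callback exists and returns a
 * positive value; otherwise fall back to the global LND timeout.
 */
static int demo_effective_timeout(const struct demo_lnd *lnd)
{
	int timeout = 0;

	if (lnd != NULL && lnd->lnd_get_timeout != NULL)
		timeout = lnd->lnd_get_timeout();

	if (timeout > 0)
		return timeout;
	return demo_global_lnd_timeout();
}
```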

Modifies the 'lnetctl net set' command to allow setting
the LND timeout value via:
'lnetctl net set --net <foo> --lnd-timeout <val>'

Renames yaml_lnet_config_ni_healthv to
yaml_lnet_config_ni_value and adds arguments to broaden
the scope of the function.

Fixes a bug where setting both --all and --nid for 'lnetctl net
set' did not return -EINVAL.

Adds sanity tests to sanity-lnet.sh that test
dynamically configured LND timeouts using values
set and displayed through the LND tunables, and that
setting the LND tunable timeout value to zero
falls back to the global lnd_timeout value.

Add timeout get functionality for netlink to kfilnd.

Signed-off-by: Caleb Carlson <caleb.carlson@hpe.com>
HPE-bug-id: LUS-12342
Test-Parameters: testlist="sanity-lnet"
Change-Id: Ic69a7d9d6af4cfed65d07caaf87d8b78238beab0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57514
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18538 ldlm: use bitmap for NS flags 86/57386/5
Timothy Day [Thu, 12 Dec 2024 05:40:32 +0000 (00:40 -0500)]
LU-18538 ldlm: use bitmap for NS flags

Use a bitmap for namespace flags in LDLM. Consolidate two
bit fields into a single bitmap. This is more in line with
Linux kernel style and more correct.
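
A userspace sketch of the bitmap approach is shown below. The flag
names are hypothetical, and the helpers are plain, non-atomic stand-ins
for the kernel's set_bit()/clear_bit()/test_bit(), which behave
atomically in the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical flag names; the real LDLM flag names may differ. */
enum demo_ns_flag {
	DEMO_NS_STOPPING,
	DEMO_NS_STACK_DUMP,
	DEMO_NS_NR_FLAGS,	/* number of flags, keep last */
};

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n)	\
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_namespace {
	/* a single bitmap replaces separate one-bit fields */
	unsigned long ns_flags[DEMO_BITS_TO_LONGS(DEMO_NS_NR_FLAGS)];
};

static void demo_set_bit(unsigned int nr, unsigned long *map)
{
	map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

static void demo_clear_bit(unsigned int nr, unsigned long *map)
{
	map[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
}

static bool demo_test_bit(unsigned int nr, const unsigned long *map)
{
	return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}
```

Adding a new flag then only requires a new enum entry before
DEMO_NS_NR_FLAGS; the array size follows automatically.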

Fixes: 70b9dc5 ("LU-17812 ldlm: stack trace log for LDLM error")
Fixes: 3d4b5da ("LU-11518 ldlm: cancel LRU improvement")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I50dd21d064147db1a93edb2e582db29c26b1c211
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-16897 tgt: note 'hole' pages 97/53297/13
Patrick Farrell [Thu, 30 Nov 2023 16:54:34 +0000 (11:54 -0500)]
LU-16897 tgt: note 'hole' pages

In order to do sparse reads, we must know which pages
correspond to holes, so we note this when the page is read
from disk.

Note something unusual: We store the hole information in
the lnb, which is a per-IO struct.  This means the hole
information is not present when a page is reused in cache.

So when a region with a hole is first read from disk, the
hole annotation is available for the transfer code, but if
the page cache is in use, this information is not available
on subsequent reads from the same pages.

This can't be avoided because the server does not have any
per-page private information for page cache pages (and ZFS
would not support this).

This isn't too costly for two reasons:
1. We default page cache to off on flash systems
2. Most data is only read once rather than many times in
quick succession

NB: It's not clear how we can efficiently get hole
information from ZFS so this is only for ldiskfs for now.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I54b1b0abeb6889163f36b315292d8b6e760d6f78
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53297
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18475 build: compatibility updates for kernel 6.12 25/57125/7
Shaun Tancheff [Sat, 21 Dec 2024 11:01:03 +0000 (16:31 +0530)]
LU-18475 build: compatibility updates for kernel 6.12

Linux commit v6.6-rc2-11-gd77008421afd
 groups: Convert group_info.usage to refcount_t
Provide wrappers to inc/dec group_info.usage

Linux v6.12-rc1-3-g5f60d5f6bbc1
 move asm/unaligned.h to linux/unaligned.h
Add a configure test to determine which header to use

Linux v6.11-rc1-51-ga225800f322a
 fs: Convert aops->write_end to take a folio
Linux v6.11-rc1-52-g1da86618bdce
 fs: Convert aops->write_begin to take a folio
Add 'struct folio' for page vs folio signature change.

Linux v6.11-rc4-27-g11068e0b64cb
  fs: remove f_version
f_version is removed, conditionally ignore it.

Linux v6.11-rc6-86-g09022bc196d2
  mm: remove PG_error
PG_error flag and PageError wrappers are removed.

Linux v6.11-rc6-233-g99f86bbda317
  mm: remove PageMlocked
PageMlocked wrappers are removed.

Linux v6.11-rc6-225-ge880034cf718
  mm: introduce page_mapcount_is_type()
PAGE_MAPCOUNT_RESERVE is removed and page_mapcount_is_type()
is used instead.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I43928749e017c95edcbba9469550c33b00160e16
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57125
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18743 llite: inode_to_wb() needs locking 61/58161/3
James Simmons [Sat, 22 Feb 2025 15:10:24 +0000 (10:10 -0500)]
LU-18743 llite: inode_to_wb() needs locking

When running a kernel with lockdep turned on, testing shows the
following error:

WARNING: CPU: 1 PID: 37 at include/linux/backing-dev.h:291 ll_writepages+0x3dd/0x400 [lustre]
Workqueue: writeback wb_workfn (flush-lustre-ffff8f09f4)
RIP: 0010:ll_writepages+0x3dd/0x400 [lustre]
Call Trace:
[ 1267.032775] ? show_regs.cold.9+0x22/0x2f
 ? __warn+0xc8/0x150
[ 1267.043623] ? ll_writepages+0x3dd/0x400 [lustre]

This is due to inode_to_wb() being called without a lock. We can
pick from 3 types of locks, but I went with the inode i_lock.

Change-Id: I7427041d6df102161c06cfbb05b7e26428675225
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58161
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18639 dne: a correct check for dir split 84/57784/11
Alexander Zarochentsev [Tue, 21 Jan 2025 20:10:22 +0000 (20:10 +0000)]
LU-18639 dne: a correct check for dir split

Use the actual dir stripe count while performing
a dir split sanity check in lod_dir_declare_dir_split().

Fix lod_object_lock() to work correctly with a striped dir
that has only one stripe.

Improve sanity test_230p by adding a dir split right
after the dir merges.

Also fix a typo in lustre/doc/lfs-migrate.1.

Fixes: 2e2b16c28b ("LU-11025 dne: support directory restripe")
Fixes: 392f558f40 ("LU-17810 dne: dir restripe without fixed hash flag")
HPE-bug-id: LUS-12701
Test-Parameters: envdefinitions=ONLY=230p fstype=ldiskfs mdtcount=2 mdscount=2 testlist=sanity
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I8d8501fd09f89d03ccb1ea92a8562326110ecc24
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57784
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18446 ptlrpc: lower CPUs latency during client I/O 39/57039/12
Bruno Faccini [Fri, 15 Nov 2024 09:24:08 +0000 (10:24 +0100)]
LU-18446 ptlrpc: lower CPUs latency during client I/O

Some CPUs with power management can suffer from high
latency when exiting an idle state.
This can have a strong impact on Lustre client performance.
Use the PM-QoS framework to guarantee a low-latency
power management mode for the CPUs/cores known to be
involved in handling the RPC replies that complete
Lustre I/Os.

Add PM-QoS configure checks to stay compatible with
older kernels:

The PM-QoS framework has been present since kernel v3.2.
DEV_PM_QOS_RESUME_LATENCY was named DEV_PM_QOS_LATENCY before v3.15.

Add 4 tunables:
  _ 'enable_pmqos' to enable/disable using PM-QoS to
    lower CPU latency
  _ 'pmqos_latency_max_usec' to allow modifying the max
    latency value to be used
  _ 'pmqos_default_duration_usec' to allow modifying
    the timeout value after which low latency is unset
  _ 'pmqos_use_stats_for_duration' to enable/disable
    using the per-target stats to set the low latency timeout

The table below summarises single-node fio (randread)
performance:

NJOBS  Target perf  Original perf  Perf with patch
1      2.5          1.05           2.56
2      5.24         2.14           5.26
4      10.8         4.36           10.5
8      21.3         8.68           20.9
16     40           16.9           40
32     65.4         32.2           64.1
64     84           56.8           83.4
128    90.8         79.6           89.9
192    91.7         85.2           91.5
256    91.9         87.4           91.8
320    91.8         89.7           91.9
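
To make the gain in the table easier to read, the sketch below
computes the ratio of the patched figure to the original figure for
three of its rows. The numbers are copied from the table. The struct,
array, and function names are illustrative only and are not part of
the patch. The table does not state a unit, so only the unitless
ratio is computed.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of the fio (randread) table; the source gives no unit. */
struct perf_row {
	int njobs;
	double target;
	double original;
	double patched;
};

/* Three sample rows copied from the table: low, mid and high NJOBS. */
const struct perf_row rows[] = {
	{   1,  2.5,  1.05,  2.56 },
	{  16, 40.0, 16.9,  40.0  },
	{ 320, 91.8, 89.7,  91.9  },
};

/* Ratio of patched to original performance for one row. */
double speedup(const struct perf_row *r)
{
	return r->patched / r->original;
}

/* Print the ratio for every sampled row. */
void print_speedups(void)
{
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("NJOBS=%d: %.2fx\n", rows[i].njobs, speedup(&rows[i]));
}
```

For these rows the ratio is about 2.4x at 1 and 16 jobs and close to
1x at 320 jobs, where the original figure is already near the target.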

Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I784a699f355da413db5029c6c7584ce3ee4ba9e1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18738 utils: avoid statx() of root of mounted FS 35/58135/2
Olaf Faaland [Tue, 18 Feb 2025 04:46:38 +0000 (20:46 -0800)]
LU-18738 utils: avoid statx() of root of mounted FS

When looking for a specific mounted lustre file system by path, avoid
the stat() or statx() call on lustre file systems whose mountpoints do
not match the given path.

This avoids hangs if the client is disconnected from MDT0 of other
mounted file systems, but the desired file system is reachable.
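
A minimal sketch of the idea, not the code from this patch: compare
the requested path against each mount point as a string first, and
call stat() or statx() only on the entry that matches. The helper name
path_in_mount() is hypothetical, and the sketch assumes both arguments
are absolute, already-canonical paths (no symlinks or ".." components).

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'path' is 'mntdir' itself or lies beneath it.
 * This is a pure string comparison: no stat()/statx() is issued,
 * so an unreachable file system cannot block the caller.
 */
bool path_in_mount(const char *path, const char *mntdir)
{
	size_t len = strlen(mntdir);

	if (len == 0)
		return false;

	/* Ignore trailing slashes on the mount point, but keep "/". */
	while (len > 1 && mntdir[len - 1] == '/')
		len--;

	if (strncmp(path, mntdir, len) != 0)
		return false;

	/* The root mount point contains every absolute path. */
	if (len == 1 && mntdir[0] == '/')
		return true;

	/* Require a component boundary: /mnt/lustre1 != /mnt/lustre10. */
	return path[len] == '\0' || path[len] == '/';
}
```

A caller walking the mount table would skip every Lustre entry for
which path_in_mount() returns false, so only the file system that
actually contains the path is ever touched.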

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I1c67214f107ae2afe34d050470155807063bda51
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>