Whamcloud - gitweb
fs/lustre-release.git
5 weeks ago LU-18528 test: wait for quota pool nr 09/58409/2
Hongchao Zhang [Fri, 14 Mar 2025 08:54:56 +0000 (16:54 +0800)]
LU-18528 test: wait for quota pool nr

In sanity-quota test_68, the quota pool count in the QMT may be
updated with a delay, so the test should wait for the update.

Test-Parameters: trivial testlist=sanity-quota fstype=zfs env=ONLY=68,ONLY_REPEAT=50
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I2e1964cf39a493d68ddd6463aaa28cf173951979
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58409
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-17950 ldiskfs: race in ext4_inode_attach_jinode 81/58381/4
Li Dongyang [Wed, 12 Mar 2025 09:28:53 +0000 (20:28 +1100)]
LU-17950 ldiskfs: race in ext4_inode_attach_jinode

A race condition can occur when multiple threads try to attach
a jinode to the same inode:

Thread 1:
ext4_map_blocks
  ext4_inode_attach_jinode
    spin_lock(&inode->i_lock)
    ei->jinode = jinode
->
    jbd2_journal_init_jbd_inode(ei->jinode, inode)

Thread 2:
ext4_map_blocks
  ext4_inode_attach_jinode
    if (ei->jinode || !EXT4_SB(inode->i_sb)->s_journal)
      return 0;
  ext4_jbd2_inode_add_write
->  jbd2_journal_file_inode

The problem is that in ext4_inode_attach_jinode() the initial check
of ei->jinode is not protected by inode->i_lock. Thread 2 can go
ahead and use the not-yet-initialized jinode in
jbd2_journal_file_inode(), and thread 1 will later call
jbd2_journal_init_jbd_inode(), corrupting the jinode.

Note that this issue is specific to ldiskfs, because
ext4-attach-jinode-in-writepages.patch added
ext4_inode_attach_jinode() to make sure the jinode is initialized
before calling ext4_jbd2_inode_add_write().
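
The window between publishing the pointer and initializing the object
can be modeled without threads. The sketch below is a single-threaded
toy, not the ldiskfs code: all toy_* names are hypothetical, and
peer_sees_uninitialized() stands in for what thread 2 would observe at
the "->" point. It only illustrates why the order "initialize, then
publish" closes the window; real kernel code also needs the lock and
memory ordering that this model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the jbd2 inode and the ext4 inode info. */
struct toy_jinode { bool initialized; };
struct toy_inode  { struct toy_jinode *jinode; };

/* What an unlocked "if (ei->jinode)" fast path would pick up right now. */
static bool peer_sees_uninitialized(const struct toy_inode *inode)
{
    return inode->jinode != NULL && !inode->jinode->initialized;
}

/* Publish first, initialize second: the order shown in the trace above. */
bool toy_attach_publish_first(struct toy_inode *inode, struct toy_jinode *j)
{
    bool window_seen;

    assert(inode != NULL && j != NULL);
    inode->jinode = j;
    window_seen = peer_sees_uninitialized(inode);  /* the "->" point */
    j->initialized = true;
    return window_seen;
}

/* Initialize first, publish second: a peer never sees a raw object. */
bool toy_attach_init_first(struct toy_inode *inode, struct toy_jinode *j)
{
    bool window_seen;

    assert(inode != NULL && j != NULL);
    j->initialized = true;
    window_seen = peer_sees_uninitialized(inode);
    inode->jinode = j;
    return window_seen;
}
```

In this model toy_attach_publish_first() reports the window and
toy_attach_init_first() does not.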

Change-Id: Iafd7aa9537505afbf4bc53fef40ea3aa0a94b7da
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58381
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-18779 lnet: lnetctl SIGSEGV in lnetctl.c getopt_internal() 22/58322/7
Frank Sehr [Thu, 6 Mar 2025 20:19:38 +0000 (12:19 -0800)]
LU-18779 lnet: lnetctl SIGSEGV in lnetctl.c getopt_internal()

Variable optindex was out of range. The whole check could be
simplified (only check optarg) if no negative values for verbose
are expected. Also modified for peer.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: I64aad7527377b098479e93040a84b0865b02de28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58322
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Manish Regmi <mregmi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-18676 tests: random write to set file size 83/58083/9
Hongchao Zhang [Sat, 15 Mar 2025 19:14:58 +0000 (03:14 +0800)]
LU-18676 tests: random write to set file size

The sanity-quota.sh test_49 is using "createmany -S 4k"
to set the size of a new file instead of writing all the
actual file data.  Add a new "-W SIZE" option to write
the specified number of random bytes instead of only
writing a few bytes at the end of the file.

This avoids issues with sparse files or data compression
resulting in less space being allocated than expected.
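
The difference between the two ways of setting a file size can be
sketched in C. This is an illustration under assumptions, not the
createmany source: set_size_sparse() and set_size_random() are
hypothetical helpers, and rand() is only a stand-in for a real random
source. Both leave the file with the same st_size; only the second
writes every byte, so sparse-file handling or compression has much
less room to shrink the allocation.

```c
#define _POSIX_C_SOURCE 200809L  /* for pwrite(), mkstemp(), ftruncate() */
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Sparse approach: write a single byte at the last offset. */
int set_size_sparse(int fd, off_t size)
{
    char byte = 'x';

    assert(size > 0);
    return pwrite(fd, &byte, 1, size - 1) == 1 ? 0 : -1;
}

/* Dense approach: fill the whole range with pseudo-random bytes. */
int set_size_random(int fd, off_t size)
{
    char buf[4096];
    off_t off = 0;

    while (off < size) {
        size_t chunk = (size - off) < (off_t)sizeof(buf) ?
                       (size_t)(size - off) : sizeof(buf);
        ssize_t n;

        for (size_t i = 0; i < chunk; i++)
            buf[i] = (char)rand();  /* stand-in for a real random source */

        n = pwrite(fd, buf, chunk, off);
        if (n <= 0)
            return -1;
        off += n;  /* a short write is simply continued from here */
    }
    return 0;
}
```

How many blocks each variant really allocates depends on the backing
filesystem, so the sketch deliberately makes no claim about st_blocks.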

Test-Parameters: testlist=sanity-quota fstype=zfs env=ONLY=49,ONLY_REPEAT=50
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ida93da881b48e6fdd85b64e90991b85f28de63d4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58083
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
5 weeks ago LU-17777 tests: Exclude Files when comparing dir structure 29/58129/8
Arshad Hussain [Wed, 19 Feb 2025 11:04:37 +0000 (16:34 +0530)]
LU-17777 tests: Exclude Files when comparing dir structure

This patch excludes /etc/yum* and /etc/pki* as a corner case, since
RHEL updates them asynchronously, which breaks the test case.

It moves the search folders from "/etc /bin" to "/etc /usr/bin", as
/bin can be a symlink.

It replaces the arbitrary return codes 18 and 22 with $?. For
diff this should be 1 for a mismatch and 2 for files not found.

It replaces the constant error message with unique messages when
rebuilding the dir structure before and after remount.

Test-Parameters: trivial testlist=runtests
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8f7824b930f6286e7e5744ff403a02cec280075d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago New tag 2.16.53 2.16.53 v2_16_53
Oleg Drokin [Wed, 19 Mar 2025 23:38:23 +0000 (19:38 -0400)]
New tag 2.16.53

Change-Id: I4a6d3cff8b78d64660d2848d02bbd0f624ea4e7e
Signed-off-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18740 mgs: size_t in contain_valid_fsname() 42/58142/3
Alex Zhuravlev [Fri, 21 Feb 2025 03:11:06 +0000 (06:11 +0300)]
LU-18740 mgs: size_t in contain_valid_fsname()

Fix a build warning with gcc 11.5.0 (Rocky 9.3), reported as an
error because of -Werror:

lustre/mgs/mgs_llog.c:4995:13: error: '__builtin_memcmp_eq'
 specified bound [18446744071562067968, 0] exceeds maximum
 object size 9223372036854775807 [-Werror=stringop-overread]
 4995 |         if (memcmp(buf, fsname, namelen) != 0)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
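
The lower bound in the message, 18446744071562067968, equals
2^64 - 2^31, which is what INT_MIN becomes when a 32-bit int is
converted to a 64-bit unsigned size. That suggests GCC was reasoning
about a signed length, although the original declaration is not shown
here, so this is an inference. A small C sketch of the arithmetic and
of a prefix compare that keeps the length in size_t throughout; both
function names are hypothetical:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value a signed int takes once sign-extended to a 64-bit unsigned type. */
uint64_t bound_after_conversion(int len)
{
    return (uint64_t)(int64_t)len;
}

/*
 * Prefix compare with unsigned lengths only, so memcmp() can never be
 * handed a bound derived from a negative number.
 */
int has_prefix(const char *buf, size_t buflen, const char *prefix)
{
    size_t plen;

    assert(buf != NULL && prefix != NULL);
    plen = strlen(prefix);
    return buflen >= plen && memcmp(buf, prefix, plen) == 0;
}
```

On a platform with 32-bit int, bound_after_conversion(INT_MIN) yields
exactly the lower bound quoted in the compiler output.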

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I77adc19e4d79d4a84a2cfe3c9601f5536ad8cc81
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58142
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18794 kernel: update RHEL 9.5 [5.14.0-503.31.1.el9_5] 78/58378/2
Jian Yu [Wed, 12 Mar 2025 07:36:46 +0000 (00:36 -0700)]
LU-18794 kernel: update RHEL 9.5 [5.14.0-503.31.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.31.1.el9_5.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.4 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.5 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3

Change-Id: Ie6ec03efef1ec6f5c2d165a0e0ac6c3d3a4fd54c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58378
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18795 kernel: update RHEL 8.10 [4.18.0-553.44.1.el8_10] 77/58377/2
Jian Yu [Wed, 12 Mar 2025 07:31:46 +0000 (00:31 -0700)]
LU-18795 kernel: update RHEL 8.10 [4.18.0-553.44.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.44.1.el8_10.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  env=SANITY_EXCEPT="66 413" \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3

Change-Id: I9bd1f29d006c9da858c941ae81352c75a332a36f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-17000 misc: avoid memory leaks in error handling 62/58362/3
Andreas Dilger [Tue, 11 Mar 2025 02:53:19 +0000 (20:53 -0600)]
LU-17000 misc: avoid memory leaks in error handling

Fix wrong GOTO() label "out_free:" instead of "out_record_free:".

Quiet a false positive for a leak in krb5_make_checksum(). Since
cfs_crypto_hash_init() never returns "req == NULL",
cfs_crypto_hash_final() is always called. Coverity is confused.
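
Why the choice of cleanup label matters can be shown with a small C
sketch of goto-based error handling. Everything here is hypothetical
(toy_alloc(), toy_setup(), the label names); it is not the patched
Lustre code. Each label frees what was allocated before the jump, so
jumping to a label that sits too late in the chain skips a free and
leaks.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Count live allocations so a leak shows up as a non-zero balance. */
static int outstanding;

static void *toy_alloc(size_t n)
{
    void *p = malloc(n);

    if (p != NULL)
        outstanding++;
    return p;
}

static void toy_free(void *p)
{
    if (p != NULL) {
        outstanding--;
        free(p);
    }
}

int toy_outstanding(void)
{
    return outstanding;
}

/* fail_step == 1 injects an error after both buffers exist. */
int toy_setup(int fail_step)
{
    char *name, *record;
    int rc = 0;

    name = toy_alloc(32);
    if (name == NULL)
        return -ENOMEM;

    record = toy_alloc(64);
    if (record == NULL) {
        rc = -ENOMEM;
        goto out_name_free;     /* only "name" exists at this point */
    }

    if (fail_step == 1) {
        rc = -EIO;
        goto out_record_free;   /* out_name_free here would leak "record" */
    }

    name[0] = '\0';
    record[0] = '\0';

out_record_free:
    toy_free(record);
out_name_free:
    toy_free(name);
    return rc;
}
```

After either the success path or the injected failure,
toy_outstanding() is back at zero.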

CoverityID: 440607 ("Resource leak")
CoverityID: 457047 ("Resource leak")

Test-Parameters: trivial
Fixes: 11eef3f735 ("LU-10499 pcc: get PCC state for file without opening")
Fixes: 553d93361d ("LU-8602 gss: get rid of cfs_crypto_hash_desc")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I725921ad89534b8ff2d8bcd526fceca3fcd90d04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58362
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-17000 llite: fix memory leaks in error handling 61/58361/2
Andreas Dilger [Tue, 11 Mar 2025 01:39:58 +0000 (19:39 -0600)]
LU-17000 llite: fix memory leaks in error handling

Ensure that allocations are freed before returning in case of errors.

CoverityID: 457069 ("Resource leak")
CoverityID: 457073 ("Resource leak")
CoverityID: 457077 ("Resource leak")

Test-Parameters: trivial
Fixes: ae828cd3b0 ("LU-4684 llite: add lock for dir layout data")
Fixes: ed4a625d88 ("LU-13717 sec: filename encryption - digest support")
Fixes: 2e2b16c28b ("LU-11025 dne: support directory restripe")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5ff33a7243e1f536e5308f61451f205f232540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58361
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18789 ldiskfs: ldiskfs patch adjustments for ubuntu 24.04.2 55/58355/3
Shuichi Ihara [Mon, 10 Mar 2025 01:35:53 +0000 (10:35 +0900)]
LU-18789 ldiskfs: ldiskfs patch adjustments for ubuntu 24.04.2

Ubuntu 24.04.2 is based on linux-6.11.0 by default.
Only a few ldiskfs patch adjustments are needed for it
to build server modules properly.

Test-Parameters: trivial
Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: Ie476ef12568b8ecb94df38b48b51646dc42923da
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58355
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-16518 llite: remove unused ll_default_lmv_inherited() 52/58352/2
Timothy Day [Sat, 8 Mar 2025 22:08:56 +0000 (17:08 -0500)]
LU-16518 llite: remove unused ll_default_lmv_inherited()

This function stopped being used in a previous patch, but it
was never removed. So let's remove it now.

Fixes: 388a185eace0 ("LU-15971 llite: implicit default LMV inherit")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7aef4ad1a08bf55abd6ec2cb906b4198dc3185f0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58352
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-16518 lod: remove unused dt_object_qos_mkdir() 51/58351/2
Timothy Day [Sat, 8 Mar 2025 22:03:30 +0000 (17:03 -0500)]
LU-16518 lod: remove unused dt_object_qos_mkdir()

This function was rendered obsolete in a previous
patch, but was not removed.

Fixes: c1d0a355a6a6 ("LU-12624 lod: alloc dir stripes by QoS")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I33ce5a5b745bff7414df8aa04ecf72d68cf8f715
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58351
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18785 build: ofd, ptlrpc missing prototypes 42/58342/3
Shaun Tancheff [Sat, 8 Mar 2025 03:51:03 +0000 (10:51 +0700)]
LU-18785 build: ofd, ptlrpc missing prototypes

nodemap_range.c:271:6: error: no previous prototype
  for '__range_delete' [-Werror=missing-prototypes]
  271 | void __range_delete(struct nodemap_range_tree *nm_range_tree,
      |      ^~~~~~~~~~~~~~

ofd_oss.c:430:5: error: no previous prototype for 'oss_mod_init'
  [-Werror=missing-prototypes]
  430 | int oss_mod_init(void)
      |     ^~~~~~~~~~~~

Test-Parameters: trivial
Fixes: 0ea23e01945 ("LU-13307 nodemap: have nodemap_add_member support large NIDs")
Fixes: b84f014d733 ("LU-14291 ofd: merge ost module into ofd")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie846dfae7ceb511318ab4ccd9494a633129c2c4d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
6 weeks ago LU-18784 dkms: add systemd check for dkms-debs 41/58341/3
Shaun Tancheff [Sat, 8 Mar 2025 01:44:51 +0000 (08:44 +0700)]
LU-18784 dkms: add systemd check for dkms-debs

The lustre-client-utils packaging of:
  /usr/lib/systemd/system/lnet.service
is conditional upon the presence of systemd.

Include the check when building the dkms-debs target.

Test-Parameters: trivial testgroup=full-dkms
HPE-bug-id: LUS-12776
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I583338cda8fd49cbb845ed71bb2cb34a1db3cc74
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58341
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18687 compat: move generic-radix-tree to lustre_compat 48/58148/2
Timothy Day [Fri, 21 Feb 2025 17:20:06 +0000 (17:20 +0000)]
LU-18687 compat: move generic-radix-tree to lustre_compat

Migrate the backported radix tree code to lustre_compat.

Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iaf6fed877b23829be948f1347d21e1ff7b9ce5a9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58148
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18687 compat: move glob to lustre_compat 45/58145/2
Timothy Day [Fri, 21 Feb 2025 15:55:18 +0000 (15:55 +0000)]
LU-18687 compat: move glob to lustre_compat

Migrate the backported glob code to lustre_compat.

Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7e57326a0ed10225e2ee866071ea7c3d259d29d4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58145
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18728 tests: use urandom to really consume ZFS space 15/58115/8
Bruno Faccini [Tue, 18 Feb 2025 17:36:12 +0000 (18:36 +0100)]
LU-18728 tests: use urandom to really consume ZFS space

It appears newer ZFS is using data compression by default, so reading
from /dev/zero results in files not consuming the expected amount of
space.  Instead, read from /dev/urandom for ZFS to write files in
sanity and conf-sanity to ensure they fill the OSTs, or the image
to be used for target creation, as expected.

Test-Parameters: testgroup=review-zfs env=ZFS_MKFS_OPTS="compression=on"
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I7b4e95032608d8db82c75e4b6dd1ec5beb6f8d99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58115
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18694 sec: nodemap local root user capabilities 66/57966/13
Sebastien Buisson [Fri, 24 Jan 2025 15:31:37 +0000 (16:31 +0100)]
LU-18694 sec: nodemap local root user capabilities

Add a new 'local_admin' rbac role, on by default. The purpose of this
new role is to keep capabilities for root even if it is mapped or
offset. This allows root to be mapped to a non-privileged storage id
while still being able to perform 'admin-like' tasks thanks to
capabilities, such as changing file permissions or file ownership.

Note that setquota and changing project id are also affected by the
local_admin role. When enabled, root on the client that gets mapped on
the file system side is still able to use them.

Be aware that if root is squashed, then capabilities are dropped as
for any other regular user.

New test sanity-sec test_64h exercises the local_admin role.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5832b21106b2829134a596c2aacf04839be856e9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18662 tests: skip fstrim on unsupported devices 51/57851/19
Alex Zhuravlev [Wed, 22 Jan 2025 06:00:31 +0000 (09:00 +0300)]
LU-18662 tests: skip fstrim on unsupported devices

If an underlying device doesn't support fstrim, never try it
again.

Fixes: 6872cf9a36 ("LU-17722 tests: trim tmpfs from wait_delete_completed()")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie7e49800ed0161c968e453a531b9701f3459a318
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57851
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18643 tests: Do not create subdirectory on client mount 07/57807/11
Marc Vef [Thu, 16 Jan 2025 18:03:14 +0000 (19:03 +0100)]
LU-18643 tests: Do not create subdirectory on client mount

When calling zconf_mount_clients() with a FILESET, the Lustre client
is mounted at the corresponding subdirectory specified by the FILESET.

However, when the subdirectory does not exist, it is silently
created. This may hide bugs, e.g., when a test needs
to verify that mounting against a non-existing directory is not
possible.

This patch removes the silent creation of the directory on client
mount, so that it is the caller's responsibility that the subdirectory
exists before mounting. Currently, no tests are relying on this
functionality as they already create the subdirectory themselves.
Therefore no test needs to be modified.

Test-Parameters: mdtcount=4 mdscount=2 env=ONLY="247 413" testlist=sanity
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I900c4bff79e6b5bde541eb4e852e42cde01820e3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
6 weeks ago LU-16565 mdc: Remove ldlm is,set,clear macros 47/57547/2
Timothy Day [Fri, 20 Dec 2024 04:14:59 +0000 (23:14 -0500)]
LU-16565 mdc: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I03dfce5398c17201e4f18b3c9792daab751fa8e6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57547
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-8066 nodemap: migrate to debugfs 01/57401/6
James Simmons [Fri, 28 Feb 2025 14:26:23 +0000 (09:26 -0500)]
LU-8066 nodemap: migrate to debugfs

The nodemap interface in proc is for administration purposes only
so we can safely move it to debugfs.

Test-Parameters: trivial testlist=sanity-sec
Change-Id: I0797bb79896ae5d9fa3bf9088b97b10505762565
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-930 man: add proper documentation for replace_nids command 70/57270/6
Artem Blagodarenko [Tue, 3 Dec 2024 23:22:22 +0000 (18:22 -0500)]
LU-930 man: add proper documentation for replace_nids command

The current lctl.8 man page entry and the manual entry are totally
lacking in explanation of what the various NIDs mean.
They should explain the behaviour of failover NIDs.

A separate "lctl replace_nids" man page was created and some
additional information added.

Test-Parameters: trivial
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I35e8fa26c109811a7411a73cd40ad811c2256e1b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57270
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-8066 exports: move procfs exports to debugfs 13/57013/12
James Simmons [Tue, 25 Feb 2025 23:08:12 +0000 (18:08 -0500)]
LU-8066 exports: move procfs exports to debugfs

The server side has an exports proc directory with several entries.
Upstreaming requires Lustre not to use the proc directory, so
we can move the exports directory to debugfs. This is server side,
so the root-only issue should be limited. This step will make
more of the stats Netlink work much easier.

Change-Id: I73e38813f049cf563cdc7e277e4fadecd5a94e98
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57013
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-11850 obd: support the rest of "stats" with Netlink 05/57305/3
James Simmons [Wed, 26 Feb 2025 18:47:19 +0000 (11:47 -0700)]
LU-11850 obd: support the rest of "stats" with Netlink

Migrate the remaining "stats" files to the debugfs Netlink API
except for the exports stats. It is possible that we lack
a kobject and a debugfs dentry, so we can end up in a case
where we can't derive a name. So change the API to supply
the stat source name instead. Update the stats packet size
calculation based on the new debugging info in the function
lnet_genl_parse_list().

Test-Parameters: trivial
Change-Id: If52dfb2807cbdcd9a24e9334edfa2101a8483fdd
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-11077 utils: --client option for set_param 58/55858/20
Frederick Dilger [Wed, 12 Jun 2024 20:43:08 +0000 (16:43 -0400)]
LU-11077 utils: --client option for set_param

Added new [--client|-C[FSNAME]] option for 'lctl set_param' which
writes the parameter to the local /etc/lustre/mount.client.params
config file. Upon each Lustre client mount those parameters will
be set on the local node. If FSNAME was provided, the parameters
will be saved in the mount-specific /etc/lustre/mount.FSNAME.params
config file, and will be set (and override) the more generic client
mount parameters on that node when that filesystem is mounted.
However, only parameters containing FSNAME can be saved to the
mount-specific params config file. This avoids generic parameters
that are only supposed to affect a single filesystem actually
affecting all of them.
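
The file selection and the FSNAME restriction described above can be
sketched in C. This is not the lctl implementation: both helper names
are hypothetical, and matching FSNAME as a plain substring of the
parameter name is an assumption made for the illustration. Only the
two file paths come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the params file path: the generic client file when no FSNAME
 * is given, otherwise the mount-specific one. Returns 0 on success and
 * -1 if the path does not fit in the buffer.
 */
int client_params_path(char *out, size_t out_size, const char *fsname)
{
    int n;

    assert(out != NULL && out_size > 0);
    if (fsname == NULL || fsname[0] == '\0')
        n = snprintf(out, out_size, "/etc/lustre/mount.client.params");
    else
        n = snprintf(out, out_size, "/etc/lustre/mount.%s.params", fsname);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Assumed rule: a parameter may go to the FSNAME file only if it names it. */
bool param_names_fs(const char *param, const char *fsname)
{
    assert(param != NULL && fsname != NULL);
    return strstr(param, fsname) != NULL;
}
```

Under that assumed rule a parameter such as
"llite.testfs-*.max_read_ahead_mb" would be accepted for FSNAME
"testfs", while "osc.*.max_dirty_mb" would be rejected because it
would apply to every mounted filesystem.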

Can be used together with [--delete|-d] to remove the parameter from
the given log file. If [--delete|-d] is specified without -C or -P
it will enable -P by default. A warning message will be printed when
this happens so users are aware of what's going on.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: Iec0b9bfb5e259154ed2439e6e505b826a888905f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55858
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-11850 lov: migrate completely to lu_tgt_descs API 59/51959/42
James Simmons [Wed, 5 Mar 2025 19:50:53 +0000 (14:50 -0500)]
LU-11850 lov: migrate completely to lu_tgt_descs API

The lov target handling was written before the generic lu_tgt
was written. Migrate to this newer API so lov can be treated
the same as lmv and lod. With these changes we have the new
lov_foreach_tgt() macro that traverses all the registered
targets, up to ltd->ltd_tgts_size in total. Also lov_tgt()
was created to extract a target by its index. Internally
a bitmap is used to tell if the tgt has been set up, and
ltd->ltd_tgts_size defines the largest possible index.

Another change is that, since the largest OST offset that
a striped Lustre file can have is 65503, we reduce the
largest index possible for an OST, since the last OSTs
could never be used.
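
The bitmap-plus-size scheme described above can be illustrated with a
small self-contained C sketch. The toy_* names and the 64-entry limit
are hypothetical and chosen for brevity; the real lu_tgt_descs code is
not reproduced here. toy_tgt() plays the role of lookup by index and
toy_foreach_tgt() the role of iteration over registered targets only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TGTS_MAX 64

struct toy_tgt { int index; };

struct toy_tgt_descs {
    uint64_t bitmap;         /* bit i set: target i has been set up */
    unsigned int tgts_size;  /* upper bound on indices, <= TOY_TGTS_MAX */
    struct toy_tgt tgts[TOY_TGTS_MAX];
};

/* Look up a target by index; NULL if out of range or not set up. */
struct toy_tgt *toy_tgt(struct toy_tgt_descs *ltd, unsigned int idx)
{
    if (idx >= ltd->tgts_size || !(ltd->bitmap & (UINT64_C(1) << idx)))
        return NULL;
    return &ltd->tgts[idx];
}

/* First registered index at or after idx, or a value >= tgts_size. */
unsigned int toy_next_tgt(const struct toy_tgt_descs *ltd, unsigned int idx)
{
    while (idx < ltd->tgts_size && !(ltd->bitmap & (UINT64_C(1) << idx)))
        idx++;
    return idx;
}

/* Visit only the indices whose bit is set. */
#define toy_foreach_tgt(ltd, idx)                                   \
    for ((idx) = toy_next_tgt((ltd), 0);                            \
         (idx) < (ltd)->tgts_size;                                  \
         (idx) = toy_next_tgt((ltd), (idx) + 1))

void toy_add_tgt(struct toy_tgt_descs *ltd, unsigned int idx)
{
    assert(idx < ltd->tgts_size);
    ltd->tgts[idx].index = (int)idx;
    ltd->bitmap |= UINT64_C(1) << idx;
}

unsigned int toy_count_tgts(struct toy_tgt_descs *ltd)
{
    unsigned int idx, n = 0;

    toy_foreach_tgt(ltd, idx)
        n++;
    return n;
}
```

With targets registered at indices 1 and 5 out of 8 slots, iteration
visits exactly two entries and lookups of unregistered or
out-of-range indices return NULL.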

Fixes: 1a6ef725c2 ("LU-16938 utils: setstripe overstripe multiple OST count")
Change-Id: If3f53b2a4589f93a024fa026ba377e2175282c29
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18776 mdt: prevent multiple data discard calls 02/58302/3
Mikhail Pershin [Wed, 5 Mar 2025 14:47:37 +0000 (17:47 +0300)]
LU-18776 mdt: prevent multiple data discard calls

The mdt_dom_discard_data() function might be called multiple times
for the same object. That creates cyclical locks for no
reason, and moreover their callbacks are executed in the
same thread recursively, causing a stack overflow.

The patch introduces the mdt_object flag mot_discard_done to
indicate that data discard was already initiated and there is no
need for another one.
Additionally, the patch doesn't allow the same thread to be used
for the lock callback if ldlm_is_ast_discard_data() is true.

Fixes: 291ac6e692 ("LU-17078 ldlm: do not spin up thread for local cancels")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7dc5d0da93a38e04267e007f5132ddb20788f18f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58302
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-18769 lnet: lnetctl memory corruption because of buffer overflow 88/58288/4
Manish Regmi [Mon, 3 Mar 2025 23:22:00 +0000 (15:22 -0800)]
LU-18769 lnet: lnetctl memory corruption because of buffer overflow

Sometimes the user-supplied name is larger than the size of
lnet_dlc_intf_descr.intf_name. Add proper validation checks before
strncpy and strcpy so that the buffer does not overflow.
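
A length check of this kind can be sketched as a small C helper that
refuses names which do not fit, instead of truncating or overflowing.
The helper name copy_intf_name() and the choice of -E2BIG are
assumptions for the illustration; the actual checks in lnetctl may
look different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, -EINVAL for bad arguments, and -E2BIG when
 * the name is too long for the destination buffer.
 */
int copy_intf_name(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -EINVAL;

    len = strlen(src);
    if (len >= dst_size)
        return -E2BIG;

    memcpy(dst, src, len + 1);  /* +1 copies the terminating NUL */
    return 0;
}
```

Rejecting the oversized name keeps the caller from silently working
with a truncated interface name, which could refer to a different
interface than the one the user asked for.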

Test-Parameters: trivial
Signed-off-by: Manish Regmi <mregmi@ddn.com>
Change-Id: Ifa867cd60ded64fcefe0a6b948f34e9f542e6e04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-18448 llite: read dir on open 69/57069/16
Alexey Lyashkov [Fri, 15 Nov 2024 09:16:04 +0000 (12:16 +0300)]
LU-18448 llite: read dir on open

Read some pages from the start of a directory on open,
since a client will probably need them.

Walk over ~100k directories with 150 files on the last leaf.

readdir on open enabled.

    real    0m39.977s
    user    0m0.121s
    sys     0m7.161s

readdir on open disabled.

    real    1m18.106s
    user    0m0.151s
    sys     0m15.666s

HPE-bug-id: LUS-7695
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Iaa674ce0d2e5723b380d7ca09407b27a90bc37f5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57069
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
6 weeks ago LU-18177 lustre: use enum cl_attr_valid instead of unsigned 82/56182/4
Bobi Jam [Wed, 28 Aug 2024 14:38:59 +0000 (22:38 +0800)]
LU-18177 lustre: use enum cl_attr_valid instead of unsigned

The last parameter of coo_attr_update() should be enum cl_attr_valid
instead of __u32/unsigned int

Test-Parameters: trivial
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I1e02f1f3621d82d5e279f6d37571ea43929f083e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56182
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks ago LU-18773 osc: initialize index_orig in osc_brw_prep_request 96/58296/2
Jian Yu [Tue, 4 Mar 2025 22:08:41 +0000 (14:08 -0800)]
LU-18773 osc: initialize index_orig in osc_brw_prep_request

This patch initializes index_orig in osc_brw_prep_request() to
fix the following error:

  In function 'osc_brw_prep_request':
  error: 'index_orig' may be used uninitialized in this function
  [-Werror=maybe-uninitialized]
       brwpg->pg->index = index_orig;
       ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~

Change-Id: I97188ea21adfa25950814a04e4f6ffdb9b763712
Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58296
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
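
The pattern behind the -Wmaybe-uninitialized error in LU-18773 can be sketched as below. This is not the body of osc_brw_prep_request(): struct demo_page and demo_prep_request() are hypothetical, and only the variable name index_orig is taken from the message. The variable is written and read under the same condition, which a compiler cannot always prove, so it is initialized at its declaration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor; only the name index_orig is from the commit. */
struct demo_page {
	unsigned long index;
};

/*
 * Temporarily override pg->index while "preparing" a request, then
 * restore it. Returns the index that was in effect during the work.
 */
static unsigned long demo_prep_request(struct demo_page *pg, bool remap,
				       unsigned long tmp_index)
{
	unsigned long index_orig = 0;	/* initialized to silence the warning */
	unsigned long used;

	assert(pg != NULL);

	if (remap) {
		index_orig = pg->index;
		pg->index = tmp_index;
	}

	used = pg->index;	/* placeholder for work that uses the index */

	if (remap)
		pg->index = index_orig;	/* only reached after the save above */

	return used;
}
```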
7 weeks ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 7) 79/58279/2
Arshad Hussain [Mon, 3 Mar 2025 07:29:33 +0000 (12:59 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 7)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments

./kernel-doc -v -none lustre/llite/symlink.c lustre/llite/rw26.c
llite/symlink.c:244: info: Scanning doc for function ll_getattr_link
llite/rw26.c:36: info: Scanning doc for function ll_invalidate_folio
llite/rw26.c:92: info: Scanning doc for function ll_invalidatepage
llite/rw26.c:719: info: Scanning doc for function ll_prepare_partial_page

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9c9fd4c5c1edc426df42165c11c54fdd694bf722
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58279
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
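
For readers unfamiliar with the comment style the LU-9633 series converts to, here is a small example of a kernel-doc function comment next to an ordinary comment. The function demo_count_set_bits() is hypothetical and not part of llite; the layout (name line, one @param line per argument, description, Return section) follows the kernel-doc convention.

```c
#include <assert.h>

/**
 * demo_count_set_bits() - Count the bits set in a flags word
 * @flags: flags word to inspect
 *
 * The double-asterisk opener is reserved for comments in this format,
 * which is why the series removes it from comments that are not
 * meant to be parsed by the kernel-doc tool.
 *
 * Return: number of bits set in @flags
 */
static unsigned int demo_count_set_bits(unsigned long flags)
{
	unsigned int count = 0;

	/* Not kernel-doc, so this uses an ordinary comment opener. */
	while (flags != 0) {
		flags &= flags - 1;	/* clear the lowest set bit */
		count++;
	}
	return count;
}
```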
7 weeks ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 6) 78/58278/2
Arshad Hussain [Thu, 27 Feb 2025 15:00:54 +0000 (20:30 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 6)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments

Tested:
./kernel-doc -v -none lustre/llite/llite_mmap.c lustre/llite/llite_nfs.c
lustre/llite/llite_mmap.c:72: info: Scanning doc for function ll_fault_io_init
lustre/llite/llite_mmap.c:266: info: Scanning doc for function ll_fault0
lustre/llite/llite_mmap.c:536: info: Scanning doc for function ll_vm_open
lustre/llite/llite_mmap.c:560: info: Scanning doc for function ll_vm_close
lustre/llite/llite_nfs.c:234: info: Scanning doc for function ll_encode_fh

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I31cc93b570db31550aa3bdc919dbd8ce82ce47a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58278
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 5) 77/58277/2
Arshad Hussain [Mon, 3 Mar 2025 07:09:42 +0000 (12:39 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 5)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments that are not meant to be kernel-doc comments

Tested:
./kernel-doc -v -none lustre/llite/namei.c lustre/llite/lproc_llite.c
llite/namei.c:101: info: Scanning doc for function ll_iget
llite/namei.c:1299: info: Scanning doc for function ll_atomic_open
llite/lproc_llite.c:83: info: Scanning doc for function ll_stats_pid_write
llite/lproc_llite.c:1355: info: Scanning doc for function default_easize_show
llite/lproc_llite.c:1383: info: Scanning doc for function default_easize_store

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8178ca5c2605341f13e307ef5e194f2b4ba8a5bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58277
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-18760 dkms: race on clobber and create of modules.order 61/58261/3
Shaun Tancheff [Fri, 28 Feb 2025 03:28:52 +0000 (10:28 +0700)]
LU-18760 dkms: race on clobber and create of modules.order

DKMS builds fail occasionally with an error:

cat: /var/lib/dkms/.../build//modules.order: No such file or directory
  MODPOST /var/lib/dkms/.../build/Module.symvers

This appears to be a make bug where a path containing //
is not handled correctly.

Remove the unnecessary injection of / into the list of SUBDIRS
to be built when ldiskfs is not enabled.

Test-Parameters: trivial
HPE-bug-id: LUS-12672
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6dda02133115076b076e6adf2ebabd10895af643
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58261
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-18687 compat: move xarray to lustre_compat 14/58114/4
Timothy Day [Wed, 12 Feb 2025 01:35:30 +0000 (20:35 -0500)]
LU-18687 compat: move xarray to lustre_compat

Migrate the backported xarray code to lustre_compat.
Along the way, create the needed build infrastructure
for lustre_compat. Currently, lustre_compat is built
into libcfs.ko.

Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I74249d0b5714bee3549bf42a8fede3f279bc37ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58114
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-18691 quota: quota interop check for 64k page clients 61/57961/6
Shaun Tancheff [Fri, 7 Feb 2025 13:21:53 +0000 (20:21 +0700)]
LU-18691 quota: quota interop check for 64k page clients

When hitting the end of available quota, a race condition can be hit
which allows a 64k-unaligned I/O to be submitted and causes the
node to hang indefinitely.

This happens when a partial write hits quota limits and a subsequent
write is not aligned on a 64k page boundary, triggering a hang due to
64k vs 4k page-aligned transfers.

HPE-bug-id: LUS-12724
Test-Parameters: testlist=sanity-quota clientarch=ppc64le clientdistro=el8.9 serverdistro=el9.4 env=ONLY=88,ONLY_REPEAT=10
Test-Parameters: testlist=sanity-quota clientarch=ppc64le clientdistro=el8.9 serverdistro=el8.9 serverversion=2.15.4 env=ONLY=88,ONLY_REPEAT=10
Test-Parameters: testlist=sanity-quota clientarch=aarch64 clientdistro=el9.3 serverdistro=el8.10 env=ONLY=88,ONLY_REPEAT=10
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0f8638062f8b0e57207695c45e1fccbd7492c32d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
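
The 64k versus 4k alignment mismatch mentioned in LU-18691 comes down to a simple offset check, sketched below. This does not reproduce the quota race or the hang; demo_is_page_aligned() and the DEMO_PAGE_SIZE_* constants are hypothetical names that only show how an offset can be aligned for a 4k-page node while being unaligned for a 64k-page client.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE_4K	4096ULL
#define DEMO_PAGE_SIZE_64K	65536ULL

/* True if offset is a multiple of page_size; page_size must be a power of two. */
static bool demo_is_page_aligned(uint64_t offset, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* For a power of two, the low bits are the remainder. */
	return (offset & (page_size - 1)) == 0;
}
```

For example, offset 8192 passes the 4k check but fails the 64k check, while 131072 passes both.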
7 weeks ago LU-16565 mdt: Remove ldlm is,set,clear macros 45/57545/2
Timothy Day [Fri, 20 Dec 2024 04:14:11 +0000 (23:14 -0500)]
LU-16565 mdt: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I866c23748a176e8cc4e391e5111d1133caf2988f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57545
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
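
The LU-16565 series replaces per-flag wrapper macros with the flag names themselves. The sketch below shows the general shape of such a change. The flag DEMO_FL_DISCARD_DATA, struct demo_lock and the demo_* macros are hypothetical; the exact form the coccinelle script produces in Lustre is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit and lock structure, for illustration only. */
#define DEMO_FL_DISCARD_DATA	(1ULL << 3)

struct demo_lock {
	uint64_t l_flags;
};

/* Old style: one wrapper macro per flag and per operation. */
#define demo_is_discard_data(lock) \
	(((lock)->l_flags & DEMO_FL_DISCARD_DATA) != 0)
#define demo_set_discard_data(lock) \
	((lock)->l_flags |= DEMO_FL_DISCARD_DATA)
#define demo_clear_discard_data(lock) \
	((lock)->l_flags &= ~DEMO_FL_DISCARD_DATA)

/* New style: the flag name appears directly at each call site. */
static bool demo_toggle_direct(struct demo_lock *lock)
{
	bool was_set;

	assert(lock != NULL);

	lock->l_flags |= DEMO_FL_DISCARD_DATA;		/* was the set macro */
	was_set = (lock->l_flags & DEMO_FL_DISCARD_DATA) != 0; /* was "is" */
	lock->l_flags &= ~DEMO_FL_DISCARD_DATA;		/* was the clear macro */

	return was_set;
}
```

Both styles manipulate the same bit; the direct form makes the flag name searchable at every use.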
7 weeks ago LU-16565 osc: Remove ldlm is,set,clear macros 44/57544/2
Timothy Day [Fri, 20 Dec 2024 04:13:50 +0000 (23:13 -0500)]
LU-16565 osc: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I40f627f5dbaa072b4a1e91adbca1c88d727fb84d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57544
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16565 quota: Remove ldlm is,set,clear macros 43/57543/2
Timothy Day [Fri, 20 Dec 2024 04:13:09 +0000 (23:13 -0500)]
LU-16565 quota: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I364cb11cbbfc00f133e1193204a920233b3a1b37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16565 target: Remove ldlm is,set,clear macros 42/57542/3
Timothy Day [Fri, 20 Dec 2024 04:12:17 +0000 (23:12 -0500)]
LU-16565 target: Remove ldlm is,set,clear macros

Replaces ldlm_{is,set,clear} macros with the direct flag
names.

The patch has been generated with the coccinelle script in
contrib/cocci/ldlm_flags.cocci.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I63198e3278d9be930c768b64ffdccc9cd1e74a76
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12885 mds: add enums for MDS_OPEN flags (4/4) 12/56812/12
Arshad Hussain [Mon, 28 Oct 2024 04:56:56 +0000 (00:56 -0400)]
LU-12885 mds: add enums for MDS_OPEN flags (4/4)

This patch is the fourth in the series of patches that separates
kernel open flags from MDS open flags.

This patch adds the function ll_kernel_to_mds_open_flags() to
convert kernel flags (fmode) to MDS flags in one place.

This patch removes the macros O_LOV_DELAY_CREATE_1_8 and
O_LOV_DELAY_CREATE_MASK everywhere, as they were only
required for interop with applications written for Lustre
1.8 clients and are not used any more.

This patch adds the functions ll_lov_delay_create_is_set() and
ll_lov_delay_create_clear() to check for and remove the
O_LOV_DELAY_CREATE flag if found in struct file->fmode.

This patch removes the remaining fmode to mds_open_flags
conversions wherever they were left.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic125dc0c7fa54888fddf435c117de9d304ea8708
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
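
The idea of converting kernel open flags to a separate MDS flag namespace in one place, as LU-12885 describes, can be sketched in userspace C. Everything here is an assumption made for illustration: enum demo_mds_open_flags and its values are invented, open(2) flags stand in for the kernel-side flags, and demo_kernel_to_mds_open_flags() is not the real ll_kernel_to_mds_open_flags().

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical wire-format flags; the values are made up for this sketch. */
enum demo_mds_open_flags {
	DEMO_MDS_FMODE_READ	= 1 << 0,
	DEMO_MDS_FMODE_WRITE	= 1 << 1,
	DEMO_MDS_OPEN_CREAT	= 1 << 2,
	DEMO_MDS_OPEN_EXCL	= 1 << 3,
	DEMO_MDS_OPEN_TRUNC	= 1 << 4,
};

/* Convert open(2) flags to the separate flag namespace in a single place. */
static enum demo_mds_open_flags demo_kernel_to_mds_open_flags(int kflags)
{
	unsigned int mds = 0;

	/* The access mode is a small enumeration, not independent bits. */
	switch (kflags & O_ACCMODE) {
	case O_RDONLY:
		mds |= DEMO_MDS_FMODE_READ;
		break;
	case O_WRONLY:
		mds |= DEMO_MDS_FMODE_WRITE;
		break;
	case O_RDWR:
		mds |= DEMO_MDS_FMODE_READ | DEMO_MDS_FMODE_WRITE;
		break;
	default:
		break;
	}

	if (kflags & O_CREAT)
		mds |= DEMO_MDS_OPEN_CREAT;
	if (kflags & O_EXCL)
		mds |= DEMO_MDS_OPEN_EXCL;
	if (kflags & O_TRUNC)
		mds |= DEMO_MDS_OPEN_TRUNC;

	return (enum demo_mds_open_flags)mds;
}
```

Keeping the mapping in one function means the two namespaces can use different numeric values without callers mixing them up.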
7 weeks ago LU-16796 ptlrpc: Change struct ptlrpc_reply_state to use kref 64/56364/3
Arshad Hussain [Wed, 11 Sep 2024 07:17:19 +0000 (03:17 -0400)]
LU-16796 ptlrpc: Change struct ptlrpc_reply_state to use kref

This patch changes struct ptlrpc_reply_state to use
kref instead of atomic_t

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I15d4982e709fe420b1fade4108581fbc7669058e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56364
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
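
The difference between a bare atomic counter and a kref is that the put operation carries the release callback. The sketch below imitates that interface in userspace with C11 atomics. It is not the kernel's struct kref and not the ptlrpc code: the demo_kref_* functions, struct demo_reply_state and its fields are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal userspace imitation of a kref-style reference counter. */
struct demo_kref {
	atomic_int refcount;
};

static void demo_kref_init(struct demo_kref *k)
{
	atomic_init(&k->refcount, 1);	/* the creator holds the first reference */
}

static void demo_kref_get(struct demo_kref *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/* Drop a reference; call release() on the last put. Returns 1 if released. */
static int demo_kref_put(struct demo_kref *k,
			 void (*release)(struct demo_kref *))
{
	assert(release != NULL);

	/* fetch_sub returns the old value: 1 means this was the last ref */
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

/* Hypothetical object embedding the reference counter. */
struct demo_reply_state {
	struct demo_kref rs_refcount;
	int rs_released;	/* set by the release callback in this sketch */
};

static void demo_reply_state_release(struct demo_kref *k)
{
	/* Recover the containing object from the embedded member. */
	struct demo_reply_state *rs = (struct demo_reply_state *)
		((char *)k - offsetof(struct demo_reply_state, rs_refcount));

	rs->rs_released = 1;	/* real code would free the object here */
}
```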
7 weeks ago LU-16796 lfsck: Change lfsck_instance to use refcount_t 95/56195/2
Arshad Hussain [Thu, 29 Aug 2024 09:34:01 +0000 (05:34 -0400)]
LU-16796 lfsck: Change lfsck_instance to use refcount_t

This patch changes struct lfsck_instance to use
refcount_t instead of atomic_t

Test-Parameters: trivial testlist=sanity-lfsck
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9bf3337ed7b68dbd44e723bf7c1374a8e3a07eb7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-16796 ptlrpc: Change struct ptlrpc_request_set to use kref 59/53459/13
Arshad Hussain [Thu, 14 Dec 2023 11:26:11 +0000 (16:56 +0530)]
LU-16796 ptlrpc: Change struct ptlrpc_request_set to use kref

This patch changes struct ptlrpc_request_set to use
kref instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icd8dc9d532121b9087455b951a1b7ee922ab532c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Timothy Day <timday@amazon.com>
7 weeks ago LU-6142 gss: SPDX for GSS 22/57822/3
Timothy Day [Sun, 19 Jan 2025 20:05:13 +0000 (15:05 -0500)]
LU-6142 gss: SPDX for GSS

Convert from verbose license text to SPDX. These files are
largely derived from in-kernel sunrpc GSS code, which derived
it from the Kerberos project.

I've tracked down each file in upstream Linux - and either
applied the correct license to the Lustre version or omitted
the SPDX (as the kernel does) along with an updated full-path
to the Linux kernel file.

Also, add the BSD-3-Clause license text.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I19283cb8d3d625842984b9112014cc58a5a04726
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57822
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18762 lnet: lst SIGSEGV in parser.c 69/58269/2
Frank Sehr [Fri, 28 Feb 2025 21:49:59 +0000 (13:49 -0800)]
LU-18762 lnet: lst SIGSEGV in parser.c

A null pointer problem: a null pointer was passed instead of the command
structure array. Added an additional check.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: I7d4e44170aeb8e44c55de681b6b3def781a0b1bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Manish Regmi <mregmi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
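
The kind of guard LU-18762 adds can be sketched as a lookup that refuses to walk a NULL command table. This is not the code in parser.c: struct demo_cmd, demo_noop() and demo_find_cmd() are hypothetical, and the table layout (terminated by an entry with a NULL name) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command table entry; the list ends with a NULL name. */
struct demo_cmd {
	const char *name;
	int (*handler)(void);
};

/* Trivial handler used to populate example tables. */
static int demo_noop(void)
{
	return 0;
}

/*
 * Look up a command by name. Returns NULL if the table or the name is
 * NULL, or if no entry matches.
 */
static const struct demo_cmd *demo_find_cmd(const struct demo_cmd *table,
					    const char *name)
{
	/* The additional check: never dereference a NULL table. */
	if (table == NULL || name == NULL)
		return NULL;

	for (; table->name != NULL; table++) {
		if (strcmp(table->name, name) == 0)
			return table;
	}
	return NULL;
}
```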
2 months ago LU-18753 socklnd: remove unused ksocknal_find_peer() 63/58263/2
Timothy Day [Fri, 28 Feb 2025 06:23:27 +0000 (01:23 -0500)]
LU-18753 socklnd: remove unused ksocknal_find_peer()

Remove ksocknal_find_peer() since it is not called
anywhere.

Fixes: 0d816af574b7 ("LU-11300 lnet: remove lnd_query interface.")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia9d15882260bf25ebf92d60239f666a6cc97d04a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58263
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18754 build: explicitly include openssl/rand.h 31/58231/2
Shaun Tancheff [Wed, 26 Feb 2025 12:04:15 +0000 (19:04 +0700)]
LU-18754 build: explicitly include openssl/rand.h

el10 build fails with:
   error: implicit declaration of function 'RAND_bytes'

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ieb1b75fbf7029b712addf9222d412d3bfa91e59e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58231
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18753 mdt: remove mdt_buf code 28/58228/2
Timothy Day [Wed, 26 Feb 2025 05:49:41 +0000 (00:49 -0500)]
LU-18753 mdt: remove mdt_buf code

Remove unused MDT buffer code, including mdt_buf()
and mdt_buf_const().

Fixes: 26b823865997 ("LU-3105 osd: remove capa related stuff from servers")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7c0ffc820e94886902a1dbf09e01d70761f7d8fd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18753 obdclass: remove obsolete dt functions 27/58227/3
Timothy Day [Wed, 26 Feb 2025 05:37:57 +0000 (00:37 -0500)]
LU-18753 obdclass: remove obsolete dt functions

Remove:

dt_path_parser()
dt_store_resolve()
dt_store_open()
dt_reg_open()
dt_find_entry()
typedef dt_entry_func_t

These are not called anywhere.

Fixes: 29e98f581ab6 ("LU-2886 obdclass: remove obsoleted md_local_file.c")
Fixes: 29adfde10ff2 ("LU-2886 mdd: create local files using local_storage lib")
Fixes: 90d8e7fd2874 ("Land b_head_interop_disk  on HEAD (20081119_1314)")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5a79f665104c0526db1e328c0e682ce85b592028
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58227
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18753 ptlrpc: remove unused stub functions 26/58226/2
Timothy Day [Wed, 26 Feb 2025 05:14:49 +0000 (00:14 -0500)]
LU-18753 ptlrpc: remove unused stub functions

Remove:

ptlrpc_ping_import_soon()
flavor_copy()
ptlrpc_cleanup_client()
__lustre_swab_buf()

They are not called anywhere.

Fixes: 86b2211e55dc ("LU-290 Reconnects are not throttled")
Fixes: 3565394baa95 ("LU-3289 gss: Add userspace support for GSS null and sk")
Fixes: 3ee0e0908f12 ("LU-5829 ptlrpc: remove unnecessary EXPORT_SYMBOL")
Fixes: 23fad25a5b6b ("b=18631")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib45aba0b76d086b0a657bb3fc79d1ec74b1e3302
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58226
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18753 ptlrpc: remove ptlrpcd_add_rqset() 25/58225/2
Timothy Day [Wed, 26 Feb 2025 05:07:50 +0000 (00:07 -0500)]
LU-18753 ptlrpc: remove ptlrpcd_add_rqset()

ptlrpcd_add_rqset() has no callers. Remove it.

Fixes: 03f537c50b76 ("LU-2244 lov: remove unused bits from lov, osc")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ica55c7559c0244af12bf1b47b350f8aeb0398f03
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58225
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18751 lnet: Segfault in lnetctl fault command 16/58216/2
Frank Sehr [Wed, 26 Feb 2025 00:24:17 +0000 (16:24 -0800)]
LU-18751 lnet: Segfault in lnetctl fault command

"lnetctl fault reset 0" and similar variations cause a segfault. This
is caused by a null pointer that is not checked in the code.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: Iec580b19f97c2a189ae8f29444bf3e3cc91d78a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Manish Regmi <mregmi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17778 tests: fix conf-sanity/76d issues 15/58215/2
Andreas Dilger [Sat, 11 Jan 2025 00:16:00 +0000 (17:16 -0700)]
LU-17778 tests: fix conf-sanity/76d issues

There was a race between remounting a client and the MGS parameter
settings being applied, so ensure the parameter setting is rechecked
if it is not correct the first time.

Also, the parameter checking for $MOUNT2 was not necessarily checking
the right instance of the parameter on the client.  Use the instance
name for the mountpoint to ensure this is checked correctly.

Test-Parameters: trivial
Fixes: fa1bff8f6f ("LU-9399 llite: register mountpoint before process llog")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib6acdc80b880dfc90b0c10406a0f868211433f58
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Nushafreen Palsetia <npalsetia@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18749 socklnd: check page for zerocopy 05/58205/2
Yang Sheng [Mon, 10 Feb 2025 19:35:20 +0000 (03:35 +0800)]
LU-18749 socklnd: check page for zerocopy

We should check the page state to ensure the kernel
can handle it in the zerocopy case.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ib82989bcca9898ecc176ddc0c9a6cd4eafbad89f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58205
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18354 tests: speed up sanity/136 on non-ZFS 99/58199/3
Andreas Dilger [Mon, 24 Feb 2025 20:52:37 +0000 (13:52 -0700)]
LU-18354 tests: speed up sanity/136 on non-ZFS

The previous change to sanity test_136 to improve test reliability
on ZFS servers resulted in the test time increasing by about 8x
(from ~300s to ~2400s).  Only wait for deletion and drop caches on
ZFS MDS nodes, and not on ldiskfs where this is not needed.

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY=136,SLOW=yes,ONLY_MINUTES=30 fstype=zfs
Test-Parameters: testlist=sanity env=ONLY=136,SLOW=yes,ONLY_MINUTES=30
Fixes: 627cc62369 ("LU-18354 tests: avoid sanity/136 OOM on ZFS servers")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic5dc79f9b7e6c2df50a97d0447ef3aa9d3c73e1d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58199
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Timothy Day <timday@amazon.com>
2 months ago LU-18744 tests: fix sanity-sec test_27ab 87/58187/3
Sebastien Buisson [Mon, 24 Feb 2025 12:05:54 +0000 (13:05 +0100)]
LU-18744 tests: fix sanity-sec test_27ab

When nodemap is activated in sanity-sec test_27ab, we need to make
sure the default nodemap grants root access, so that clients and
servers can be stopped and restarted.
Also fix an incorrect call to 'lctl nodemap_add_idmap'.

Test-Parameters: trivial
Fixes: e3051ad0f1 ("LU-18109 utils: adding nodemap offset capability")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0adafae67c7637c616c687590bd01ff12f4d6bf2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-16518 mdt: fix unused-but-set-variable warnings 84/58184/2
Timothy Day [Mon, 24 Feb 2025 06:38:13 +0000 (01:38 -0500)]
LU-16518 mdt: fix unused-but-set-variable warnings

Remove unused variables in various parts of the MDT code.
This silences the Clang compiler -Wunused-but-set-variable
warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I2c42e9ce86ac854a49f8b12f5325ce1f34b8ecc3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58184
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
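
The warning addressed by the LU-16518 series fires when a variable is assigned but never read. A minimal sketch of the pattern and one way to remove it follows. The functions demo_lookup() and demo_process() are hypothetical and do not correspond to any MDT code.

```c
#include <assert.h>

/* Hypothetical helper whose result the caller does not need. */
static int demo_lookup(int key)
{
	return key * 2;
}

/*
 * Before the cleanup this function would have contained
 *     int rc;
 *     rc = demo_lookup(key);
 * with rc never read afterwards, which Clang reports under
 * -Wunused-but-set-variable. Here the variable is gone, the call is
 * kept, and the result is discarded explicitly.
 */
static int demo_process(int key)
{
	(void)demo_lookup(key);
	return key + 1;
}
```

Whether the call itself should also go depends on whether it has side effects; this sketch keeps it to show the smallest possible change.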
2 months ago LU-16518 target: fix unused-but-set-variable warnings 83/58183/2
Timothy Day [Mon, 24 Feb 2025 06:21:23 +0000 (01:21 -0500)]
LU-16518 target: fix unused-but-set-variable warnings

In tgt_checksum_niobuf(), remove the unused err return code
of cfs_crypto_hash_final() to silence a Clang compiler
warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6dfe664479d4430c4386d2ff50644e41d91a4c28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58183
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-16518 osc: fix unused-but-set-variable warnings 82/58182/2
Timothy Day [Mon, 24 Feb 2025 06:18:11 +0000 (01:18 -0500)]
LU-16518 osc: fix unused-but-set-variable warnings

When CONFIG_CRC_T10DIF=n and osc_checksum_bulk_t10pi() is
a macro, Clang generates compiler warnings for some of the
arguments - since they are not used elsewhere. Silence this
by creating a proper function.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I502dcf1764602711fcf2cf3553ad6d2f4fed3f14
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58182
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
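
The macro-versus-function point in the osc commit above can be sketched as follows. DEMO_HAVE_T10PI is a hypothetical stand-in for the real configuration option, and demo_checksum_bulk_t10pi() and demo_caller() are invented names. The sketch only shows why a function stub keeps the caller's variables "used" when the feature is compiled out.

```c
#include <assert.h>
#include <errno.h>

#ifdef DEMO_HAVE_T10PI
/* Real implementation would live elsewhere when the feature is enabled. */
int demo_checksum_bulk_t10pi(const char *name, int nob, unsigned int *cksum);
#else
/*
 * A macro stub such as
 *     #define demo_checksum_bulk_t10pi(name, nob, cksum) (-EOPNOTSUPP)
 * drops its arguments during preprocessing, so variables that are set
 * only for this call look unused to Clang. A static inline stub still
 * references every argument at the call site.
 */
static inline int demo_checksum_bulk_t10pi(const char *name, int nob,
					   unsigned int *cksum)
{
	(void)name;
	(void)nob;
	(void)cksum;
	return -EOPNOTSUPP;
}
#endif

/* Caller whose locals exist only to be passed to the checksum helper. */
static int demo_caller(void)
{
	unsigned int cksum = 0;
	int nob;

	nob = 4096;	/* set only for the call below */
	return demo_checksum_bulk_t10pi("demo", nob, &cksum);
}
```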
2 months ago LU-16518 lnet: fix unused-but-set-variable warnings 81/58181/2
Timothy Day [Mon, 24 Feb 2025 06:12:08 +0000 (01:12 -0500)]
LU-16518 lnet: fix unused-but-set-variable warnings

Remove unused primary_nid variable in lnet_peer_merge_data()
to silence a Clang compiler warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib66fd31c7acc08fa66578cd7ab571f278f98afe1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58181
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-16518 ldlm: fix unused-but-set-variable warnings 80/58180/2
Timothy Day [Mon, 24 Feb 2025 01:53:27 +0000 (20:53 -0500)]
LU-16518 ldlm: fix unused-but-set-variable warnings

In ldlm_flock_completion_ast(), obd* is not being
used. Remove it to silence the Clang compiler warning.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Idc8d0fd9a351b4bb328a0956eabd2026460cdfe1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58180
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-16518 obdclass: fix unused-but-set-variable warnings 79/58179/2
Timothy Day [Mon, 24 Feb 2025 01:50:28 +0000 (20:50 -0500)]
LU-16518 obdclass: fix unused-but-set-variable warnings

Remove swab variable in class_config_yaml_output()
that is not being used.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I0db780c98fe34cef91988fc3f8c2ccac9481de2c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58179
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-16518 llite: fix unused-but-set-variable warnings 78/58178/2
Timothy Day [Mon, 24 Feb 2025 01:47:17 +0000 (20:47 -0500)]
LU-16518 llite: fix unused-but-set-variable warnings

Remove unused variables in llite. Clang does not like
this, so discard them.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Icb555135fc5ee53c2a7b2819beed4a78fe89b91d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58178
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-16518 pcc: fix unused-but-set-variable warnings 77/58177/2
Timothy Day [Mon, 24 Feb 2025 01:25:30 +0000 (20:25 -0500)]
LU-16518 pcc: fix unused-but-set-variable warnings

In several places, pcc_file is set but never used.
Clang doesn't like this, so discard this variable.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7fde9d49bd6309335e6f9083ba588fc86d495b1c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58177
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-17995 obdclass: remove obdname2fsname() 74/58174/2
Timothy Day [Sun, 23 Feb 2025 18:26:59 +0000 (13:26 -0500)]
LU-17995 obdclass: remove obdname2fsname()

This function is not called anywhere.

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib6c92787685564e812634c8a466b9edc27ba6977
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58174
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17995 osd-zfs: remove server_name_is_ost() 63/58163/2
Timothy Day [Sat, 22 Feb 2025 17:54:10 +0000 (12:54 -0500)]
LU-17995 osd-zfs: remove server_name_is_ost()

We can determine whether a server name is for an
OST by checking the return code of server_name2index().

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I2e1c337b9095d333772f87bd2a5253966b54bd45
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58163
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18516 quota: use wait_woken for qsd_op_begin0() 56/58156/3
James Simmons [Sat, 22 Feb 2025 12:56:17 +0000 (07:56 -0500)]
LU-18516 quota: use wait_woken for qsd_op_begin0()

Kernels with debugging enabled report for Lustre quota handling:

do not call blocking ops when !TASK_RUNNING;
? __might_sleep+0x9d/0xc0
  down_read_nested+0x2e/0x4b0
  lquota_disk_read+0x46e/0x800 [lquota]
  qsd_refresh_usage+0x105/0x3d0 [lquota]
  qsd_acquire+0xbe/0x7c0 [lquota]
  qsd_op_begin0+0x5f8/0xc80 [lquota]

This is due to qsd_acquire() performing operations that can sleep while
the kthread is in an idle state. The Linux kernel solution for this
is wait_woken(). Move the function qsd_op_begin0() from using
wait_event_idle_timeout() to wait_woken(). This will resolve the
potential sleeping issues.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: Id2b7a5886869bf0ed3d560e159524dcda841d8b0
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-8289 utils: fix ll_decode_linkea doc 28/58128/2
Sohei Koyama [Wed, 19 Feb 2025 07:24:42 +0000 (16:24 +0900)]
LU-8289 utils: fix ll_decode_linkea doc

Arguments "#123451" and "#123452" in example were hidden by "\".

Test-Parameters: trivial
Signed-off-by: Sohei Koyama <skoyama@ddn.com>
Change-Id: I059abac920fcc5ecfe03eddc00fef1dd6d89db27
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58128
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 4) 00/58100/2
Arshad Hussain [Mon, 17 Feb 2025 11:11:26 +0000 (16:41 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 4)

This patch converts existing functional comments
to kernel doc style comments and removes '/**' for
comments which are not meant to be kernel-doc comments.

Tested with:
<kernel src path>/scripts/kernel-doc -v -none <file>

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I1b0eae8e684d96b843cd5da15d6ed2ef944ad9d2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58100
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 3) 98/58098/2
Arshad Hussain [Mon, 17 Feb 2025 08:15:42 +0000 (13:45 +0530)]
LU-9633 llite: Add kernel doc style for lustre/llite/*.c (Part 3)

This patch converts existing functional comments
to kernel doc style comments and removes '/**' for
comments which are not meant to be kernel-doc comments.

Tested with:
<kernel src path>/scripts/kernel-doc -v -none <file>

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I360e4d93d161e17172095b638cbf3628791c35a6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58098
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18723 hsm: sanity-hsm 500 hung in llapi_hsm_copytool_recv 84/58084/5
Sebastien Buisson [Fri, 14 Feb 2025 09:16:56 +0000 (17:16 +0800)]
LU-18723 hsm: sanity-hsm 500 hung in llapi_hsm_copytool_recv

sanity-hsm hung in test_500 in llapi_hsm_test test100.
The bug can be reproduced by the following test script:
ONLY="411 500" REFORMAT=yes ./sanity-hsm.sh

The reason is that the previous test case 411 does not clean up
properly and fails to unregister the HSM agent: under the active
rbac role the permission check fails and -EPERM is returned:
mdt_hsm_ct_unregister() {
	...
	if (!mdt_hsm_is_admin(info))
		GOTO(out, rc = -EPERM);
	...

This bug can easily be solved by making sure nodemap is always removed
before the copytool is cleaned up.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I093775eeaf39b4d2671e3a05e41f33a9e1d8ec5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58084
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Robert Read <rread@ddn.com>
Reviewed-by: Robert Read <rread@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18705 pcc: only reset file mapping for valid cached file 99/57999/3
Qian Yingjin [Thu, 6 Feb 2025 12:41:59 +0000 (20:41 +0800)]
LU-18705 pcc: only reset file mapping for valid cached file

It should only reset and revise file mapping for the valid cached
file (@cached == true) in pcc_file_mapping_reset().

Otherwise, it will cause sanity-pcc test_97 panic as follows:
(pcc.c:3077:pcc_vma_file_reset())
ASSERTION( vma->vm_file->f_mapping == inode->i_mapping ) failed:
panic+0x114/0x2f6
lbug_with_loc.cold+0x30/0x69 [libcfs]
pcc_mmap_io_init+0xafe/0xd60 [lustre]
pcc_fault+0x170/0x3d0 [lustre]
ll_fault+0x43/0x9a0 [lustre]
__do_fault+0x3c/0x170
do_fault+0x24b/0x640

Test-Parameters: testlist=sanity-pcc env=ONLY=97,ONLY_REPEAT=50
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I7e9cd0cba9d230160c90a32bef452139c23164b3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57999
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-18750 kernel: update RHEL 9.5 [5.14.0-503.26.1.el9_5] 12/58212/2
Jian Yu [Tue, 25 Feb 2025 19:30:32 +0000 (11:30 -0800)]
LU-18750 kernel: update RHEL 9.5 [5.14.0-503.26.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.26.1.el9_5.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.4 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.5 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3

Change-Id: Id381dd6628ad738fa23ddbe3746f42457269595f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18566 lnet: dynamically configure timeouts 14/57514/6
Caleb Carlson [Tue, 18 Jun 2024 19:04:42 +0000 (13:04 -0600)]
LU-18566 lnet: dynamically configure timeouts

Add/use default LND timeouts:
* SOCKNAL_TIMEOUT_DEFAULT = 50,
* IBLND_TIMEOUT_DEFAULT   = 50,
* KFILND_TIMEOUT_DEFAULT  = 125,
* GNILND_TIMEOUT_BASE    = 60

LND timeouts default to these if not set by kernel
module params. Return only this value from the
<lnd>_timeout() functions, dropping the call to
lnet_get_lnd_timeout() which was based on the LTT and
LRC values.

Adds lnd_get_timeout() function to the lnet_lnd API
procedural struct, which returns the LND timeout of
whichever LND initialized the struct.

Use this lnd_get_timeout() function to update
the lnet_lnd tunables upon retrieval, to get current
value from module parameters.

For kfilnd, switch to using kfilnd_timeout() instead of
lnet_get_lnd_timeout(). Define KP_PURGE_LIMIT for KFI
peer purge timeout limits.

For lolnd, there's no timeout function definition, so
added conditional logic to check if the timeout function
is valid and returns a positive integer. Also, LNetGet
using the loopback LND creates the message with both
msg_txni and msg_rxni being NULL, so we check for that
condition.

Use control flow for send/recv to find correct msg NI.
Fix formatting of struct array in nidstrings.c.

Add module param path variables for ksocklnd,
kkfilnd, and kgnilnd. Renames the o2ib_modparam
variable to be more consistent:
o2iblnd_modparam_path.

Remove dependency on default lnet_lnd_timeout value
in kgnilnd_timeout() function; use tunable value
instead.

Fallback to lnet_get_lnd_timeout() if tunables timeout
value is 0 (or is unset).
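
The fallback rule above can be sketched in a few lines of user-space C.
This is only an illustration under stated assumptions: the helper name
effective_lnd_timeout() and both parameters are hypothetical and are not
taken from the patch, and treating a negative value as "unset" is an
assumption as well.

```c
/*
 * Hypothetical helper (not the function from the patch): choose the
 * per-LND tunable timeout when it is set, otherwise fall back to the
 * global LNet LND timeout.
 */
int effective_lnd_timeout(int tunable_timeout, int global_timeout)
{
	/* Assumption: 0 or a negative value means "unset". */
	if (tunable_timeout > 0)
		return tunable_timeout;

	return global_timeout;
}
```

For example, effective_lnd_timeout(0, 50) yields the global value 50,
while effective_lnd_timeout(125, 50) keeps the tunable value 125.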

Modifies the 'lnetctl net set' command to allow setting
the LND timeout value via:
'lnetctl net set --net <foo> --lnd-timeout <val>'

Renames yaml_lnet_config_ni_healthv to
yaml_lnet_config_ni_value and adds arguments to broaden
the scope of the function.

Fixes a bug where setting both --all and --nid for 'lnetctl net
set' did not return -EINVAL.

Adds sanity tests to sanity-lnet.sh that test
dynamically configured LND timeouts using values
from LND tunables set and display, and test
that setting the LND tunable timeout value to zero
ends up defaulting to the global lnd_timeout value.

Add timeout get functionality for netlink to kfilnd.

Signed-off-by: Caleb Carlson <caleb.carlson@hpe.com>
HPE-bug-id: LUS-12342
Test-Parameters: testlist="sanity-lnet"
Change-Id: Ic69a7d9d6af4cfed65d07caaf87d8b78238beab0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57514
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18538 ldlm: use bitmap for NS flags 86/57386/5
Timothy Day [Thu, 12 Dec 2024 05:40:32 +0000 (00:40 -0500)]
LU-18538 ldlm: use bitmap for NS flags

Use a bitmap for namespace flags in LDLM. Consolidate two
bit fields into a single bitmap. This is more in line with
Linux kernel style and more correct.
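
As a rough user-space sketch of the idea (not the code from this patch),
independent flags can be kept as bit indices in one unsigned long array
instead of separate bit fields. The flag names and helper functions below
are hypothetical; in the kernel the same pattern is normally written with
DECLARE_BITMAP(), set_bit(), clear_bit() and test_bit(), which, unlike the
plain operations here, are atomic.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag indices; the real LDLM flag names are not shown here. */
enum ns_flag {
	NS_FLAG_STOPPING,
	NS_FLAG_DUMP_STACK,
	NS_FLAG_COUNT		/* number of flags, must stay last */
};

#define NS_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define NS_WORDS(nbits)		(((nbits) + NS_BITS_PER_WORD - 1) / NS_BITS_PER_WORD)

struct ns_sketch {
	/* One bitmap replaces several separate bit fields. */
	unsigned long flags[NS_WORDS(NS_FLAG_COUNT)];
};

/* Non-atomic helpers; a kernel version would use set_bit() and friends. */
void ns_set_flag(struct ns_sketch *ns, enum ns_flag f)
{
	ns->flags[f / NS_BITS_PER_WORD] |= 1UL << (f % NS_BITS_PER_WORD);
}

void ns_clear_flag(struct ns_sketch *ns, enum ns_flag f)
{
	ns->flags[f / NS_BITS_PER_WORD] &= ~(1UL << (f % NS_BITS_PER_WORD));
}

bool ns_test_flag(const struct ns_sketch *ns, enum ns_flag f)
{
	return (ns->flags[f / NS_BITS_PER_WORD] >> (f % NS_BITS_PER_WORD)) & 1UL;
}
```

Each flag occupies its own bit, so updating one flag does not require a
read-modify-write of a word shared with an unrelated bit field.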

Fixes: 70b9dc5 ("LU-17812 ldlm: stack trace log for LDLM error")
Fixes: 3d4b5da ("LU-11518 ldlm: cancel LRU improvement")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I50dd21d064147db1a93edb2e582db29c26b1c211
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-16897 tgt: note 'hole' pages 97/53297/13
Patrick Farrell [Thu, 30 Nov 2023 16:54:34 +0000 (11:54 -0500)]
LU-16897 tgt: note 'hole' pages

In order to do sparse reads, we must know which pages
correspond to holes, so we note this when the page is read
from disk.

Note something unusual: We store the hole information in
the lnb, which is a per-IO struct.  This means the hole
information is not present when a page is reused in cache.

So when a region with a hole is first read from disk, the
hole annotation is available for the transfer code, but if
the page cache is in use, this information is not available
on subsequent reads from the same pages.

This can't be avoided because the server does not have any
per-page private information for page cache pages (and ZFS
would not support this).

This isn't too costly for two reasons:
1. We default page cache to off on flash systems
2. Most data is only read once rather than many times in
quick succession

NB: It's not clear how we can efficiently get hole
information from ZFS so this is only for ldiskfs for now.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I54b1b0abeb6889163f36b315292d8b6e760d6f78
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53297
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18475 build: compatibility updates for kernel 6.12 25/57125/7
Shaun Tancheff [Sat, 21 Dec 2024 11:01:03 +0000 (16:31 +0530)]
LU-18475 build: compatibility updates for kernel 6.12

Linux commit v6.6-rc2-11-gd77008421afd
 groups: Convert group_info.usage to refcount_t
Provide wrappers to inc/dec group_info.usage

Linux v6.12-rc1-3-g5f60d5f6bbc1
 move asm/unaligned.h to linux/unaligned.h
Add a configure test to determine which header to use

Linux v6.11-rc1-51-ga225800f322a
 fs: Convert aops->write_end to take a folio
Linux v6.11-rc1-52-g1da86618bdce
 fs: Convert aops->write_begin to take a folio
Add 'struct folio' for page vs folio signature change.

Linux v6.11-rc4-27-g11068e0b64cb
  fs: remove f_version
f_version is removed, conditionally ignore it.

Linux v6.11-rc6-86-g09022bc196d2
  mm: remove PG_error
PG_error flag and PageError wrappers are removed.

Linux v6.11-rc6-233-g99f86bbda317
  mm: remove PageMlocked
PageMlocked wrappers are removed.

Linux v6.11-rc6-225-ge880034cf718
  mm: introduce page_mapcount_is_type()
PAGE_MAPCOUNT_RESERVE is removed and page_mapcount_is_type()
is used instead.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I43928749e017c95edcbba9469550c33b00160e16
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57125
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18743 llite: inode_to_wb() needs locking 61/58161/3
James Simmons [Sat, 22 Feb 2025 15:10:24 +0000 (10:10 -0500)]
LU-18743 llite: inode_to_wb() needs locking

When running a kernel with lockdep turned on, testing shows the
following error:

WARNING: CPU: 1 PID: 37 at include/linux/backing-dev.h:291 ll_writepages+0x3dd/0x400 [lustre]
Workqueue: writeback wb_workfn (flush-lustre-ffff8f09f4)
RIP: 0010:ll_writepages+0x3dd/0x400 [lustre]
Call Trace:
[ 1267.032775] ? show_regs.cold.9+0x22/0x2f
 ? __warn+0xc8/0x150
[ 1267.043623] ? ll_writepages+0x3dd/0x400 [lustre]

This is due to inode_to_wb() being called without a lock. We can
pick from 3 types of locks, but I went with the inode i_lock.

Change-Id: I7427041d6df102161c06cfbb05b7e26428675225
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58161
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18639 dne: a correct check for dir split 84/57784/11
Alexander Zarochentsev [Tue, 21 Jan 2025 20:10:22 +0000 (20:10 +0000)]
LU-18639 dne: a correct check for dir split

Use the actual dir stripe count while performing
a dir split sanity check in lod_dir_declare_dir_split().

Fix lod_object_lock() to work with a striped dir with
only one stripe correctly.

Improve sanity test_230p by adding a dir split right
after the dir merges.

Also fix a typo in lustre/doc/lfs-migrate.1 .

Fixes: 2e2b16c28b ("LU-11025 dne: support directory restripe")
Fixes: 392f558f40 ("LU-17810 dne: dir restripe without fixed hash flag")
HPE-bug-id: LUS-12701
Test-Parameters: envdefinitions=ONLY=230p fstype=ldiskfs mdtcount=2 mdscount=2 testlist=sanity
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I8d8501fd09f89d03ccb1ea92a8562326110ecc24
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57784
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18446 ptlrpc: lower CPUs latency during client I/O 39/57039/12
Bruno Faccini [Fri, 15 Nov 2024 09:24:08 +0000 (10:24 +0100)]
LU-18446 ptlrpc: lower CPUs latency during client I/O

Some CPUs with power management can suffer from high
latency when exiting from an idle state.
This can have a strong impact on Lustre client performance.
Use the PM-QoS framework to guarantee usage of a low-latency
power management mode for the CPUs/cores known to be
involved in handling RPC replies for Lustre I/O
completion.

Add PM-QoS configure checks to handle the following cases
for compatibility with older kernels:

The PM-QoS framework has been present since kernel v3.2.
DEV_PM_QOS_RESUME_LATENCY was named DEV_PM_QOS_LATENCY before v3.15.

Add 4 tunables:
  _ 'enable_pmqos' to enable/disable using PM-QoS to
    bump CPUs latency
  _ 'pmqos_latency_max_usec' to allow modifying the max
    latency value to be used
  _ 'pmqos_default_duration_usec' to allow modifying
    the timeout value to unset low latency
  _ 'pmqos_use_stats_for_duration' to enable/disable
    using the per-target stats to set low latency timeout

Here is a table summarising the single node fio (randread)
performance:
NJOBS       Target perf      Original perf   perf with patch
1           2.5              1.05            2.56
2           5.24             2.14            5.26
4           10.8             4.36            10.5
8           21.3             8.68            20.9
16          40               16.9            40
32          65.4             32.2            64.1
64          84               56.8            83.4
128         90.8             79.6            89.9
192         91.7             85.2            91.5
256         91.9             87.4            91.8
320         91.8             89.7            91.9

Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I784a699f355da413db5029c6c7584ce3ee4ba9e1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18738 utils: avoid statx() of root of mounted FS 35/58135/2
Olaf Faaland [Tue, 18 Feb 2025 04:46:38 +0000 (20:46 -0800)]
LU-18738 utils: avoid statx() of root of mounted FS

When looking for a specific mounted lustre file system by path, avoid
the stat() or statx() call on lustre file systems whose mountpoints do
not match the given path.

This avoids hangs if the client is disconnected from MDT0 of other
mounted file systems, but the desired file system is reachable.
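
One way to decide whether a mountpoint "matches" a path without touching
the file system is a pure string comparison on path components, so that
stat()/statx() is only issued for the candidate that matches. The sketch
below assumes this prefix-style matching; the function name
path_is_under_mount() is hypothetical and the actual patch may compare
paths differently.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if 'path' is the mountpoint 'mnt'
 * itself or lies below it. Only strings are compared, so no request is
 * sent to a possibly unreachable server.
 */
bool path_is_under_mount(const char *path, const char *mnt)
{
	size_t len = strlen(mnt);

	if (len == 0)
		return false;

	/* Ignore trailing slashes on the mountpoint, but keep "/" itself. */
	while (len > 1 && mnt[len - 1] == '/')
		len--;

	if (strncmp(path, mnt, len) != 0)
		return false;

	/* The root mountpoint contains every absolute path. */
	if (len == 1 && mnt[0] == '/')
		return true;

	/* Require a component boundary so /mnt/lustre2 != /mnt/lustre. */
	return path[len] == '\0' || path[len] == '/';
}
```

A caller would skip the stat()/statx() call for every mounted file system
where this check returns false.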

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I1c67214f107ae2afe34d050470155807063bda51
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18727 target: reference transaction early 12/58112/3
Alex Zhuravlev [Tue, 18 Feb 2025 09:14:08 +0000 (12:14 +0300)]
LU-18727 target: reference transaction early

Grab a reference to the distributed transaction handle before putting
it on the list; otherwise the concurrent commit thread can find it and
release it, and the original thread will then access a just-freed
structure.
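
The ordering rule can be illustrated with a small user-space sketch. All
names here (tx_handle, tx_publish(), tx_commit_one()) are hypothetical and
do not come from the Lustre sources; list locking is left out for brevity,
and a real implementation would need it.

```c
#include <stdatomic.h>
#include <stddef.h>

struct tx_handle {
	atomic_int refcount;
	struct tx_handle *next;
};

struct tx_list {
	struct tx_handle *head;
};

void tx_init(struct tx_handle *tx)
{
	atomic_init(&tx->refcount, 1);	/* reference the list will own */
	tx->next = NULL;
}

void tx_get(struct tx_handle *tx)
{
	atomic_fetch_add(&tx->refcount, 1);
}

/* Returns 1 when the last reference was dropped (caller may free). */
int tx_put(struct tx_handle *tx)
{
	return atomic_fetch_sub(&tx->refcount, 1) == 1;
}

/* Take the caller's reference BEFORE the handle becomes visible. */
void tx_publish(struct tx_list *list, struct tx_handle *tx)
{
	tx_get(tx);		/* caller's own reference, taken first */
	tx->next = list->head;
	list->head = tx;	/* from here on the commit side can see it */
}

/* Commit side: unlink one handle and drop the list's reference. */
int tx_commit_one(struct tx_list *list)
{
	struct tx_handle *tx = list->head;

	if (tx == NULL)
		return 0;
	list->head = tx->next;
	tx->next = NULL;
	return tx_put(tx);
}
```

Because the caller's reference is taken before the handle is linked, the
commit side dropping the list's reference can never be the final put while
the original thread is still using the handle.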

Test-Parameters: testlist=replay-single,replay-dual
Test-Parameters: testlist=replay-single,replay-dual
Test-Parameters: testlist=replay-single,replay-dual
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I12546b1bce3b3f0fe3c74a99bb589bee39768754
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58112
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
2 months ago LU-18392 tests: watch destroys_in_flight recovery_small/160 96/58096/3
Li Dongyang [Mon, 17 Feb 2025 03:37:55 +0000 (14:37 +1100)]
LU-18392 tests: watch destroys_in_flight recovery_small/160

In recovery_small/160, we sleep and check for destroys_in_flight
and make sure the number is low, which indicates destroys are not
blocked.

However, there is a 10s timeout: the destroy RPCs could be retried,
and before they are put back on error_list again, destroys_in_flight
could be bumped up. If the test case happens to check
destroys_in_flight during this window, it could fail.

Use wait_update_cond to watch for the expected drop.

Change-Id: I0b29a90e4c78e80a0b5a522d57ed97db1b698364
Test-Parameters: trivial testlist=recovery-small env=ONLY=160,ONLY_REPEAT=100
Fixes: 27f787daa7 ("LU-15737 ofd: don't block destroys")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58096
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months ago LU-8066 osc: move complex proc to debugfs 90/58090/2
James Simmons [Fri, 14 Feb 2025 17:50:47 +0000 (12:50 -0500)]
LU-8066 osc: move complex proc to debugfs

Time to pull the trigger and move the complex proc files
to debugfs for the osc layer.

Change-Id: I719362a6885fd9272ad92249e089ed059dc734b5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58090
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-8066 mdc: move complex proc to debugfs 89/58089/3
James Simmons [Fri, 14 Feb 2025 17:54:42 +0000 (12:54 -0500)]
LU-8066 mdc: move complex proc to debugfs

Time to pull the trigger and move the complex proc files
to debugfs for the mdc layer.

Change-Id: Id4cc8a53ab581ec6a56578d6d23b3541946933e5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58089
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-8066 ldlm: move debugfs entries to general sysfs 87/58087/3
James Simmons [Fri, 14 Feb 2025 16:05:20 +0000 (11:05 -0500)]
LU-8066 ldlm: move debugfs entries to general sysfs

Any simple debugfs files in the top ldlm directory should be
moved to /sys/fs/lustre/ldlm. This ensures these files will
always be available which is not the case for debugfs.

Change-Id: I851923894d8b6c594056d71a891364d5f54988a1
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58087
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18721 kernel: update SLES15 SP6 [6.4.0-150600.23.38.1] 80/58080/2
Jian Yu [Fri, 14 Feb 2025 06:00:46 +0000 (22:00 -0800)]
LU-18721 kernel: update SLES15 SP6 [6.4.0-150600.23.38.1]

Update SLES15 SP6 kernel to 6.4.0-150600.23.38.1 for Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testlist=sanity

Test-Parameters: optional mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testgroup=full-dne-part-1

Test-Parameters: optional mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testgroup=full-dne-part-2

Test-Parameters: optional mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testgroup=full-dne-part-3

Change-Id: I56a4c37f2316c9180d07b8633db4334f59e09fce
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58080
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18680 ptlrpc: Improve yaml output for nodemap offset 85/58085/2
Marc Vef [Fri, 14 Feb 2025 11:14:22 +0000 (12:14 +0100)]
LU-18680 ptlrpc: Improve yaml output for nodemap offset

Nodemap offset presents loose values on lctl get_param:
$ lctl get_param -n nodemap.t0.offset
start_uid: 100000
limit_uid: 200000
start_gid: 100000
limit_gid: 200000
start_projid: 100000
limit_projid: 200000

This is technically valid YAML syntax, but not contextually bound as a
unit or to its key "offset". In the future, we want to save/restore
entire nodemap configurations. This means that the key "offset" is
necessary to relate the values to. This can be done via indenting or
more explicitly with {}, as done in this patch to be consistent with
other nodemap properties like ranges or idmap and to not necessarily
rely on indentation:
$ lctl get_param -n nodemap.t0.offset
{
 start_uid: 100000,
 limit_uid: 200000,
 start_gid: 100000,
 limit_gid: 200000,
 start_projid: 100000,
 limit_projid: 200000
}

Note, offsets are always shown even for zero, thus {} is not
shown when no offsets are defined.
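
A minimal user-space sketch of emitting this brace-delimited form is shown
below. The struct offset_sketch type and format_offset() function are
hypothetical stand-ins for illustration; the patch itself prints from
kernel code and is not reproduced here.

```c
#include <stdio.h>

/* Hypothetical container for the six offset values shown above. */
struct offset_sketch {
	unsigned int start_uid, limit_uid;
	unsigned int start_gid, limit_gid;
	unsigned int start_projid, limit_projid;
};

/*
 * Format the offsets as a single YAML flow mapping so that all values
 * stay grouped under the "offset" key without relying on indentation.
 * Returns the snprintf() result (length that was or would be written).
 */
int format_offset(char *buf, size_t len, const struct offset_sketch *o)
{
	return snprintf(buf, len,
			"{\n"
			" start_uid: %u,\n"
			" limit_uid: %u,\n"
			" start_gid: %u,\n"
			" limit_gid: %u,\n"
			" start_projid: %u,\n"
			" limit_projid: %u\n"
			"}\n",
			o->start_uid, o->limit_uid,
			o->start_gid, o->limit_gid,
			o->start_projid, o->limit_projid);
}
```

Note that the last entry carries no trailing comma, matching the output
shown in the commit message.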

Test-Parameters: trivial testlist=sanity-sec env=ONLY=27ab
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I9b9a5411c3adcd39d45f29c8fc1d8c51163ba9b5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58085
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-9859 libcfs: migrate hash table to obdclass 45/57945/5
James Simmons [Mon, 3 Feb 2025 17:07:27 +0000 (12:07 -0500)]
LU-9859 libcfs: migrate hash table to obdclass

The Lustre-specific hash is only used by the Lustre layer, so it
doesn't make sense to keep it in libcfs. Move the hash code to
obdclass. Once the ldlm resource hashtable moves to rhashtable, this
will be server-only code.

Test-Parameters: trivial
Change-Id: I676f6a3decf17e0e90cd747ad4bb4c4d16a52a30
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57945
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-18211 build: add e2fsprogs-devel to dkms dependency 34/57934/3
Andreas Dilger [Fri, 31 Jan 2025 07:39:23 +0000 (23:39 -0800)]
LU-18211 build: add e2fsprogs-devel to dkms dependency

The e2fsprogs-devel package is needed to build server packages
with DKMS, so add it to the Requires: list, since RHEL defaults
to XFS these days and it may not always be installed.

Test-Parameters: trivial testgroup=full-dkms
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1d5ef22d2caa54821dc55d401beff1f491300c1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57934
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16641 tests: 12b: unlink recently created files 29/57929/5
Sergey Cheremencev [Wed, 29 Jan 2025 22:22:15 +0000 (01:22 +0300)]
LU-16641 tests: 12b: unlink recently created files

It is possible that createmany hasn't created the requested
number of files on mdt0. So remove only the number of
successfully created files instead of the requested number.

Fix error_ignore to avoid default error behaviour.
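
The cleanup rule above can be modelled outside the test framework. This is
only an illustrative Python sketch, not the sanity-quota shell code: the
helpers create_many() and unlink_many() and the fail_after knob are
hypothetical stand-ins for createmany/unlinkmany and for creation stopping
early.

```python
import os
import tempfile


def create_many(directory, prefix, count, fail_after=None):
    """Create up to `count` empty files; return how many were created.

    `fail_after` is a hypothetical knob that simulates creation stopping
    early (for example, when a limit is reached).
    """
    created = 0
    for i in range(count):
        if fail_after is not None and i >= fail_after:
            break
        with open(os.path.join(directory, f"{prefix}{i}"), "w"):
            pass
        created += 1
    return created


def unlink_many(directory, prefix, count):
    """Remove files prefix0 .. prefix(count-1); raises if one is missing."""
    for i in range(count):
        os.unlink(os.path.join(directory, f"{prefix}{i}"))


with tempfile.TemporaryDirectory() as d:
    requested = 10
    created = create_many(d, "f", requested, fail_after=6)
    # Remove only what was actually created, not what was requested.
    unlink_many(d, "f", created)
    remaining = os.listdir(d)
```

Passing the requested count to unlink_many() instead would fail on the first
file that was never created.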

Fixes: 25896b8b88 ("LU-16641 tests: fix sanity-quota_12b")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ic6d5a02295c73fbed0773408c67a47850dee1f80
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18680 ptlrpc: improve nodemap lproc syntaxes 15/57915/7
Aurelien Degremont [Mon, 20 Jan 2025 14:07:59 +0000 (15:07 +0100)]
LU-18680 ptlrpc: improve nodemap lproc syntaxes

- 'idmap='

Fix bad YAML syntax when only gid or projid mappings were set, with no uid
mapping. An empty list is now simply a pair of brackets, with no multiple or
empty lines.

  idmap=[]

- 'exports='

Each entry is now on a separate line, like 'idmap'. While still
YAML compliant, this is easier for admins to read. The trailing ',' has been
removed.
An empty list is simply a pair of brackets, with no multiple or empty lines.

  exports=[]
  exports=[
   { nid: 172.16.0.1@tcp, uuid: 1d49406a-68eb-4d54-ae08-3587d6a6b078 },
   { nid: 172.16.0.1@tcp, uuid: 48ed3108-de34-11ef-bd15-670a7bd749aa }
  ]

- 'ranges='

An empty list is now simply a pair of brackets, with no multiple or empty
lines. Still one entry per line when populated.

  ranges=[]
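
The output shape described for 'idmap', 'exports' and 'ranges' can be
summarised with a small model. This is a hedged Python sketch of the
formatting rules only; the real output is produced by kernel C code, and
format_list() is a hypothetical name.

```python
def format_list(name, entries):
    """Render `name=[...]` with one flow mapping per line, or `name=[]`."""
    if not entries:
        # Empty list: a bare pair of brackets, no extra lines.
        return f"{name}=[]"
    rows = [
        " { " + ", ".join(f"{key}: {value}" for key, value in entry.items()) + " }"
        for entry in entries
    ]
    # Joining with ",\n" leaves the last entry without a trailing comma.
    return f"{name}=[\n" + ",\n".join(rows) + "\n]"


exports = [
    {"nid": "172.16.0.1@tcp", "uuid": "1d49406a-68eb-4d54-ae08-3587d6a6b078"},
    {"nid": "172.16.0.1@tcp", "uuid": "48ed3108-de34-11ef-bd15-670a7bd749aa"},
]
print(format_list("exports", exports))
print(format_list("ranges", []))
```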

Change-Id: Ib1711614a825dc3bbb2b8861a61461fdea4e4f4b
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-18621 lnet: fix wrong stats for lnet_net_show_dump 85/57885/2
James Simmons [Fri, 24 Jan 2025 15:06:50 +0000 (10:06 -0500)]
LU-18621 lnet: fix wrong stats for lnet_net_show_dump

For the function lnet_get_ni_stats() the im_idx is assumed to be
relative to all NIs. With lnet_net_show_dump() the idx we
use is relative to our own NI list, so it doesn't always match
the idx lnet_get_ni_stats() expects. This can lead to the wrong
stats being collected when a net_id is set. The only reason we
need an idx for lnet_get_ni_stats() is so the NI can be located,
but we already know the NI. Just call lnet_usr_translate_stats()
directly instead of using lnet_get_ni_stats() to figure out the
NI and then call lnet_usr_translate_stats() internally.
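
The index mismatch can be illustrated with a toy model. This Python sketch
is not LNet code: the NI class, the sample NIDs and counters, and the
function names are all assumptions made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class NI:
    nid: str
    net: str
    sent: int


# Hypothetical global NI list spanning two nets.
ALL_NIS = [
    NI("10.0.0.1@tcp", "tcp", 5),
    NI("10.0.0.2@tcp", "tcp", 7),
    NI("192.168.0.1@o2ib", "o2ib", 11),
]


def stats_by_global_index(idx):
    """Models a lookup whose index is relative to all NIs."""
    return ALL_NIS[idx].sent


def dump_net_buggy(net):
    """Passes a per-net index to a lookup that expects a global index."""
    nis = [ni for ni in ALL_NIS if ni.net == net]
    return {ni.nid: stats_by_global_index(i) for i, ni in enumerate(nis)}


def dump_net_fixed(net):
    """The NI is already known, so read its stats directly."""
    nis = [ni for ni in ALL_NIS if ni.net == net]
    return {ni.nid: ni.sent for ni in nis}


buggy = dump_net_buggy("o2ib")
fixed = dump_net_fixed("o2ib")
```

In this model the buggy dump for "o2ib" reports the counter of the first
tcp NI, while the dump for "tcp" happens to be correct, which matches the
"doesn't always match" wording above.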

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 8f8f6e2f36e ("LU-10003 lnet: use Netlink to support old and new NI APIs.")
Change-Id: Ie39e65146c21d976f9a7655eead8c46e9293ee27
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57885
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-18610 obdclass: add job expired flag 16/57616/7
Shaun Tancheff [Wed, 5 Feb 2025 16:56:47 +0000 (23:56 +0700)]
LU-18610 obdclass: add job expired flag

In lprocfs_job_cleanup() expired jobs are de-referenced before
being removed from the lru to defer holding a spinlock.
This opens a race where a job can be put multiple times
when only a single put on expiry is expected. To close this
race, use a bit flag to skip the extra de-reference
on jobs that are already being expired and removed.
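
The idea of guarding the put with a flag can be sketched as follows. This
is a simplified Python model, assuming an atomic test-and-set on the flag;
the Job class and expire() are hypothetical and do not mirror the actual
lprocfs data structures.

```python
import threading


class Job:
    def __init__(self):
        self.refcount = 1      # the reference that expiry is expected to drop
        self.expired = False   # models the "expired" bit flag
        self._lock = threading.Lock()

    def test_and_set_expired(self):
        """Atomically set the flag and return its previous value."""
        with self._lock:
            was_set = self.expired
            self.expired = True
            return was_set

    def put(self):
        with self._lock:
            self.refcount -= 1


def expire(job):
    """Drop the expiry reference at most once, however many callers race."""
    if job.test_and_set_expired():
        return False  # another caller is already expiring this job
    job.put()
    return True


job = Job()
results = []
threads = [
    threading.Thread(target=lambda: results.append(expire(job)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the flag check, every racing caller would call put() and the
reference count would drop below zero.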

HPE-bug-id: LUS-12670
Test-Parameters: testlist=sanity env=ONLY=205,ONLY_REPEAT=100
Fixes: cad59b9b72 ("LU-18351 obdclass: jobstat scaling")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia7dc91cac313919827cc13db971ffb3debe318c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
2 months agoLU-18618 ofd: handle filter_fid properly on server 78/57678/31
Bobi Jam [Wed, 8 Jan 2025 12:38:16 +0000 (20:38 +0800)]
LU-18618 ofd: handle filter_fid properly on server

The object's filter_fid could be changed via out_xattr_set(), leaving
ofd_object::ofo_ff out of date. The old ofo_ff layout version could
then be bigger than that of the merged file, so write/punch on
this OFD object could be rejected, because operations with a
smaller layout version than the one on disk are prohibited.
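
The failure mode can be shown with a toy model of a cached layout version.
This Python sketch is an assumption-laden illustration: the OfdObject class,
the refresh_cache parameter and the version numbers are hypothetical, and
refreshing the cache on xattr set is only one possible remedy, not
necessarily what this patch does.

```python
class OfdObject:
    def __init__(self, disk_layout_version):
        self.disk_layout_version = disk_layout_version
        # Models the in-memory copy (ofo_ff) of the on-disk filter_fid.
        self.cached_layout_version = disk_layout_version

    def xattr_set(self, new_version, refresh_cache):
        """Change the on-disk version; optionally refresh the cached copy."""
        self.disk_layout_version = new_version
        if refresh_cache:
            self.cached_layout_version = new_version

    def write_allowed(self, client_layout_version):
        """Reject operations older than the version the server believes in."""
        return client_layout_version >= self.cached_layout_version


# Cached copy left stale after the on-disk version drops from 7 to 3.
stale = OfdObject(7)
stale.xattr_set(3, refresh_cache=False)

# Cached copy kept in sync with the on-disk version.
fresh = OfdObject(7)
fresh.xattr_set(3, refresh_cache=True)
```

With the stale cache, a write carrying layout version 3 is rejected even
though it matches what is on disk.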

Test-Parameters: testlist=sanity-flr env=ONLY=70a,ONLY_REPEAT=250
Test-Parameters: testlist=sanity-flr env=ONLY=200,ONLY_REPEAT=50
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I8bd8c1af4d7806c3d8e4ab9de5af519381d36060
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57678
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>