Whamcloud - gitweb
fs/lustre-release.git
2 months ago LU-17379 lnet: add LNetPeerDiscovered to LNet API 26/53926/8
Serguei Smirnov [Mon, 5 Feb 2024 20:14:30 +0000 (12:14 -0800)]
LU-17379 lnet: add LNetPeerDiscovered to LNet API

LNetPeerDiscovered is added to allow Lustre to check
whether the peer has been successfully discovered by LNet
before attempting to open a connection to it.
For example, given a mount command with a list of NIDs,
Lustre can use the LNetAddPeer API to initiate discovery on
every candidate first, and later use LNetPeerDiscovered
to select a reachable peer to connect to.
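
The two-phase pattern described above can be sketched as follows. This is
only an illustration in Python: start_discovery and is_discovered are
hypothetical callbacks standing in for LNetAddPeer and LNetPeerDiscovered,
whose real kernel signatures are not shown in this message, and the NIDs
are made up.

```python
from typing import Callable, Iterable, Optional


def select_reachable_peer(
    nids: Iterable[str],
    start_discovery: Callable[[str], None],
    is_discovered: Callable[[str], bool],
) -> Optional[str]:
    """Start discovery on every candidate, then return the first discovered NID."""
    candidates = list(nids)
    # Phase 1: kick off discovery for all candidates up front.
    for nid in candidates:
        start_discovery(nid)
    # Phase 2: pick the first candidate that discovery reports as reachable.
    for nid in candidates:
        if is_discovered(nid):
            return nid
    return None


# Toy stand-in for LNet state: only one of the NIDs answers discovery.
reachable = {"10.0.0.2@tcp"}
started = []
chosen = select_reachable_peer(
    ["10.0.0.1@tcp", "10.0.0.2@tcp"],
    start_discovery=started.append,
    is_discovered=lambda nid: nid in reachable,
)
print(chosen)
```

Starting discovery on all candidates before checking any of them lets the
slow discovery steps overlap instead of running one after another.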

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I7c9964148a5a2a24d7889b8b4c2e488a433ca258
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53926
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-17500 qmt: avoid "enforced bit set, but neither" 93/53893/3
Sergey Cheremencev [Fri, 2 Feb 2024 20:07:00 +0000 (23:07 +0300)]
LU-17500 qmt: avoid "enforced bit set, but neither"

Don't call qmt_revalidate_qunit in qmt_set_with_lqe,
as the lqe_enforced bit may not be cleared when the
hard and soft limits are both set to 0. There is no
reason to recalculate qunit and edquot when the limits
are set to 0. When the limits are changed, qunit and
edquot are calculated below in the "dirtied" branch,
so there is no reason to do this twice.

This patch helps avoid the following error:
LustreError: 21362:0:(qmt_entry.c:746:qmt_adjust_qunit())
  $$$ enforced bit set, but neither hard nor soft limit are set

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8f5d9630f43b66ae7ea2be0bf2c735a02e1f6299
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53893
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-17481 mdt: count all opens in mdt.*.md_stats 80/53880/5
Yang Sheng [Thu, 1 Feb 2024 16:31:13 +0000 (00:31 +0800)]
LU-17481 mdt: count all opens in mdt.*.md_stats

Count all opens for the MDT. Also add a test case to
verify the count.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I2fa90cc2b4ce8d7d039736a5f40a70cbeb04bf8c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53880
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17454 nodemap: allow mapping for root 70/53870/2
Sebastien Buisson [Wed, 31 Jan 2024 14:40:44 +0000 (15:40 +0100)]
LU-17454 nodemap: allow mapping for root

Allow an id mapping for root, to match what is implemented for regular
users, with the following behavior:
- if admin property is set, root remains root.
- if admin property is not set, the idmap for '0' is taken into
  account.
- if admin property is not set and there is no idmap for '0' and
  deny_unknown property is not set, root is squashed to the squash
  uid/gid.
- if admin property is not set and there is no idmap for '0' and
  deny_unknown property is set, root is blocked.
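
The decision order in the list above can be sketched as a small function.
This is an illustration in Python, not the nodemap code itself: map_root
and its parameters are hypothetical names, a single ID stands in for both
uid and gid, and 65534 is just an example squash ID.

```python
from typing import Dict, Optional


def map_root(admin: bool, idmap: Dict[int, int], deny_unknown: bool,
             squash_id: int) -> Optional[int]:
    """Return the server-side ID for client root, or None if root is blocked."""
    if admin:
        return 0              # admin set: root remains root
    if 0 in idmap:
        return idmap[0]       # idmap for '0' is taken into account
    if deny_unknown:
        return None           # no idmap and deny_unknown set: root is blocked
    return squash_id          # no idmap, deny_unknown unset: root is squashed


# Walk through the four cases with an example idmap of 0 -> 1000.
print(map_root(True, {0: 1000}, True, 65534))
print(map_root(False, {0: 1000}, True, 65534))
print(map_root(False, {}, False, 65534))
print(map_root(False, {}, True, 65534))
```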

Note that map_mode remains ignored for root. Also, capabilities are
not dropped for root when mapped, just as for regular users. If
admins want to drop root capabilities, root must be squashed.

sanity-sec test_15 is updated to test root mapping.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id2e950b99e3b3ba27179408c647e1f7b7c49e32e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53870
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13257 llite: Disallow users to set/clear group lock flag 82/53782/3
Matt Ezell [Tue, 23 Jan 2024 15:40:52 +0000 (18:40 +0300)]
LU-13257 llite: Disallow users to set/clear group lock flag

Group locks are created/freed via dedicated ioctls. Disallow manually
setting or clearing the flag.

HPE-bug-id: LUS-12078
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Signed-off-by: Matt Ezell <ezellma@ornl.gov>
Change-Id: Id5022cc02a7bdce2f0150592470e8336b4537a61
Reviewed-on: https://es-gerrit.hpc.amslabs.hpecorp.net/162708
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Tested-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53782
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months ago LU-17453 llite: use dget_parent to access dentry.d_parent 57/53757/10
Shaun Tancheff [Mon, 5 Feb 2024 06:47:49 +0000 (13:47 +0700)]
LU-17453 llite: use dget_parent to access dentry.d_parent

Use dget_parent() to acquire the d_parent member of a dentry
to ensure the dentry is valid while it is accessed.

HPE-bug-id: LUS-11889
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Icb0a25ece5a3a3d50da076708fcd631176652a1b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53757
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17441 mdc: use MDS_IO_PORTAL for rename 25/53725/8
Andreas Dilger [Thu, 18 Jan 2024 09:49:48 +0000 (02:49 -0700)]
LU-17441 mdc: use MDS_IO_PORTAL for rename

Some workloads like Apache Spark are very rename intensive. If there
are many concurrent renames that need the BFL lock (more than the
number of MDS_REQUEST_PORTAL service threads), they will block these
threads until each is able to get the rename lock, and prevent other
MDS_REINT RPCs from being processed.

Since the MDS_IO_PORTAL is often unused (only needed for DoM files),
and has existed since 2.11.0, it seems possible to move the rename
RPCs to be serviced by the MDS_IO_PORTAL threads to avoid contention
on the primary MDS service threads. Also, it will avoid blocking
normal file open, setattr, statfs, and other common operations if the
BFL lock is contended. Even with DoM files they may have read-on-open
handling and only DoM writes would be blocked by the uncommon rename.

Test-Parameters: testlist=sanity serverversion=2.15 \
env=SANITY_EXCEPT="56x 56xa 56xc 65p 70a 119h 119i 123g 123h 123i 398d 398o"
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I623a27de1482778f3c9fc6bb5bbcf917611dc75b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53725
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
2 months ago LU-17415 ldlm: lock conversion to skip cancelled locks 45/53645/2
Alex Zhuravlev [Thu, 11 Jan 2024 05:28:40 +0000 (08:28 +0300)]
LU-17415 ldlm: lock conversion to skip cancelled locks

ldlm_cli_inodebits_convert() should re-check that the lock is not
being cancelled, and skip such locks to avoid an assertion:

LustreError:
15208:0:(ldlm_lock.c:1095:ldlm_grant_lock_with_skiplist())
ASSERTION( ldlm_is_granted(lock) ) failed:

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If212931d8fa6a2d8f56c44714de830d5fb4a9a6b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53645
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17357 mgc: wait for sptlrpc config log 23/53423/19
Sebastien Buisson [Tue, 12 Dec 2023 16:49:49 +0000 (17:49 +0100)]
LU-17357 mgc: wait for sptlrpc config log

The sptlrpc config log is mandatory to establish connections to
targets with proper security context. So wait for its retrieval.

Add sanity-sec test_68 to exercise this, and improve test_32
for mgssec.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5352e926dc6a9a68db1224629c68a42b74bee8a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53423
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-17317 sec: add srpc_serverctx proc file 76/53376/7
Sebastien Buisson [Tue, 5 Dec 2023 13:14:58 +0000 (14:14 +0100)]
LU-17317 sec: add srpc_serverctx proc file

GSS srpc contexts for client connections can already be dumped via
proc file <mdc,osc>.*.srpc_contexts.
This patch adds a new proc file to dump server side GSS srpc contexts,
e.g.:
mgs.MGS.gss.srpc_serverctx
mdt.testfs-MDT0000.gss.srpc_serverctx
obdfilter.testfs-OST0000.gss.srpc_serverctx

The GSS context information is dumped as YAML, with one line per
context, like this:
0000000013221bdf: { peer_nid: 192.168.56.206@tcp, uid: 0, ctxref: 1,
expire: 1707934985, delta: 3401, flags: [uptodate, cached], seq: 0,
win: 2048, key: 00000000, keyref: 0,
hdl: "0x5ae1a771fd57043:0x65a64972fda4e200",
mech: "krb5 (aes256-cts-hmac-sha1-96)" }

Because of this new syntax, sanity-sec test_28 needs to be fixed.

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I37da9ffe6dd5884006b36271185a4d7155ead65b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17337 osd: ask for more revoke credits 65/53365/4
Alex Zhuravlev [Tue, 5 Dec 2023 05:20:58 +0000 (08:20 +0300)]
LU-17337 osd: ask for more revoke credits

Starting with 4.* kernels, JBD2 tracks the number of potentially
revoked blocks separately from regular journal blocks and
checks that a transaction doesn't exceed the declared number.
Before the extent merging patch, a regular block allocation could
free only a very limited number of blocks. Now, with extent
merging, when an extent tree is really big and a few extents
are inserted in a single transaction, such an allocation
can exceed the default revoke credits (8).
This patch uses the number of extents in the transaction to calculate
the potential number of revoke records (max tree depth * default).

Fixes: 0f7e6c02a9 ("LU-16843 ldiskfs: merge extent blocks")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4967deb56e5aba82b68ffdc91de589fffae6a64a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53365
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months ago LU-17226 build: create config option for l_getsepol 49/52849/5
Gian-Carlo DeFazio [Thu, 16 Nov 2023 23:05:45 +0000 (15:05 -0800)]
LU-17226 build: create config option for l_getsepol

Add a configuration option for l_getsepol.
l_getsepol is built by default unless the --disable-l_getsepol
option is given to configure.
lustre.spec.in builds l_getsepol by default and has its
dependencies as build requirements.

The implicit configuration check for the dependency
openssl-devel is removed and replaced by a BuildRequires.

Test-Parameters: trivial
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: If71a2a4a524047edbd2b31e6fac7a42f36a030bf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52849
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-15743 utils: add --xattr option to lfs find 04/52804/11
Thomas Bertschinger [Tue, 17 Oct 2023 20:32:33 +0000 (16:32 -0400)]
LU-15743 utils: add --xattr option to lfs find

This adds a new "[!] --xattr" option to lfs find to enable listing
files that match a given extended attribute. The option takes an
argument in the form "NAME[=VALUE]" where NAME is a regular
expression for the attribute name and VALUE is an optional regular
expression to match the named attribute's value. If the option is
negated, only files that do not match the option are listed.

The provided regular expressions must match the entire name or value,
not just a substring. If only NAME is provided, files will match if
they have an extended attribute matching the name, regardless of the
attribute's contents. The option may be specified multiple times, and
files must match every provided argument in this case.
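
The matching rules above can be illustrated with a short sketch. It uses
Python's re module as a stand-in, so the regular expression dialect may
differ from the one lfs find accepts. The helper names, the split at the
first '=', and the sample attributes are assumptions made for the example.

```python
import re
from typing import Dict, List, Optional, Tuple


def parse_xattr_arg(arg: str) -> Tuple[str, Optional[str]]:
    """Split "NAME[=VALUE]" at the first '=' (an assumption of this sketch)."""
    name_re, sep, value_re = arg.partition("=")
    return name_re, (value_re if sep else None)


def xattr_matches(xattrs: Dict[str, str], arg: str) -> bool:
    """True if any attribute fully matches NAME and, when given, VALUE."""
    name_re, value_re = parse_xattr_arg(arg)
    for name, value in xattrs.items():
        if re.fullmatch(name_re, name) is None:
            continue  # the name must match in full, not as a substring
        if value_re is None or re.fullmatch(value_re, value) is not None:
            return True
    return False


def file_selected(xattrs: Dict[str, str], tests: List[Tuple[str, bool]]) -> bool:
    """Each test is (arg, negated); a file must satisfy every test."""
    return all(xattr_matches(xattrs, arg) != negated for arg, negated in tests)


attrs = {"user.job": "12345", "user.owner": "alice"}
print(xattr_matches(attrs, r"user\.job"))        # name only
print(xattr_matches(attrs, r"user"))             # only a substring of a name
print(xattr_matches(attrs, r"user\..*=alice"))   # name and value
print(file_selected(attrs, [(r"user\.job", False), (r"trusted\..*", True)]))
```

Real extended attribute values are byte strings; plain text values are
used here only to keep the example short.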

Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: I7b02e704b741ee30387a827dd5a25a20574cc3df
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52804
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-15461 test: add stop file check 37/52737/4
Hongchao Zhang [Thu, 26 Oct 2023 14:08:00 +0000 (22:08 +0800)]
LU-15461 test: add stop file check

Add a check for the creation of the "stop file".

Change-Id: I4acd36e61faf4259c2821293ffb7913d4cca76bd
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52737
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17200 mdt: check object's locality 16/52716/9
Alex Zhuravlev [Mon, 16 Oct 2023 18:22:05 +0000 (21:22 +0300)]
LU-17200 mdt: check object's locality

A remote object can disappear while we're getting an ldlm lock for
it. We can't check the object's attributes before we're sure it
exists, so check the object's locality first.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86ad0f3e7c38b0dce51a9fd836ba2293b210fe4f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months ago LU-17161 build: Avoid fortify_memset in OBD_FREE_PTR 59/52559/5
Shaun Tancheff [Wed, 18 Oct 2023 07:17:29 +0000 (02:17 -0500)]
LU-17161 build: Avoid fortify_memset in OBD_FREE_PTR

OBD_FREE_PTR will optionally clear the memory that is about to
be freed.

Unfortunately fortify_memset_chk() hits some false positives.

We can use __underlying_memset() if it is defined, to avoid
the fortify_memset_chk.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iced53f22b97ed90e0970625c4fcbaa404054c54a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52559
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-16724 ptlrc: ptlrpc: extend sec bulk functionality 35/52335/9
Artem Blagodarenko [Wed, 11 Oct 2023 21:20:40 +0000 (17:20 -0400)]
LU-16724 ptlrc: ptlrpc: extend sec bulk functionality

Features such as client-side-data-compression and unaligned
direct I/O need page/buffer pools for good performance.

This patch extends sec bulk functionality to allocate different
size buffers. Memory shrinking and other useful features
should still work as expected.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I929b4dfdcb0e8197f3804629b000af0d4bd6f2a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52335
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-17081 build: compatibility for 6.5 kernels 58/52258/15
Shaun Tancheff [Wed, 7 Feb 2024 03:23:00 +0000 (10:23 +0700)]
LU-17081 build: compatibility for 6.5 kernels

Linux commit v6.4-rc2-29-gc6585011bc1d
  splice: Remove generic_file_splice_read()

Prefer filemap_splice_read and provide alternates for older kernels.

Linux commit v6.4-rc2-30-g3fc40265ae2b
  iov_iter: Kill ITER_PIPE

ITER_PIPE and iov_iter_is_pipe() are removed, so provide a replacement
for iov_iter_is_pipe().

Linux commit v6.4-rc4-53-g54d020692b34
  mm/gup: remove unused vmas parameter from get_user_pages()

Use vma_lookup() to acquire the vma following get_user_pages()

Linux commit v6.4-rc7-1884-gdc97391e6610
  sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)

Use sendmsg when MSG_SPLICE_PAGES is defined. Provide a wrapper
using sendpage() for older kernels.

HPE-bug-id: LUS-11811
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I95a0954a602c8db08d30b38a50dcd50107c8f268
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17042 target: allow fsmap ioctl 47/52147/2
Li Dongyang [Tue, 29 Aug 2023 05:18:34 +0000 (15:18 +1000)]
LU-17042 target: allow fsmap ioctl

Pass through the FS_IOC_GETFSMAP ioctl to the underlying ldiskfs
so e2freefrag can make use of online query.

Change-Id: Ia4f1fd3c0b02429b247fa71e73b4a95b98b47026
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52147
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-15367 llite: add iotrace to open/release 06/52006/5
Patrick Farrell [Fri, 18 Aug 2023 20:30:26 +0000 (16:30 -0400)]
LU-15367 llite: add iotrace to open/release

Add iotrace to open and release operations.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idc20a05417398af20dee313531a3573a8aa4e4c0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52006
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
2 months ago LU-15367 llite: add setattr to iotrace 05/52005/5
Patrick Farrell [Fri, 18 Aug 2023 20:20:47 +0000 (16:20 -0400)]
LU-15367 llite: add setattr to iotrace

Add setattr messages to iotrace.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I10a51285d38e1684ce0ddcc7bb2a0cd90579c96c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52005
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
2 months ago LU-17527 tests: fix syntax error in test_255a 09/54009/3
Courrier Guillaume [Mon, 12 Feb 2024 15:19:03 +0000 (16:19 +0100)]
LU-17527 tests: fix syntax error in test_255a

The syntax error occurs because the average speed can be less than 1
(e.g. .85), in which case ${average_cache%.*} is empty and so the left
operand of < is empty.

This patch fixes the test by using the speedup instead. The test should
compare speedup_cache and speedup_ladvise with lowest_speedup instead of the
average read time.
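
To make the failure mode concrete, the sketch below emulates the shell
expansion ${value%.*} (remove the shortest suffix matching the glob '.*')
in Python. The function name and sample values are illustrative only; the
test itself is a shell script.

```python
def strip_fraction(value: str) -> str:
    """Emulate shell ${value%.*}: drop the shortest suffix matching '.*'."""
    dot = value.rfind(".")
    # No dot means the pattern does not match and the value is unchanged.
    return value if dot == -1 else value[:dot]


# A value below 1 written without a leading zero loses everything.
for sample in ("12.5", ".85", "3"):
    print(repr(sample), "->", repr(strip_fraction(sample)))
```

A value such as .85 expands to the empty string, so the comparison is
left without a left operand.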

Test-Parameters: trivial testlist=sanity env=ONLY=255a,ONLY_REPEAT=50
Signed-off-by: Courrier Guillaume <guillaume.courrier@cea.fr>
Change-Id: Ie2cd24f813a0efe65e3391a3fb664b9db39a9f92
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54009
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
2 months ago LU-15784 obdecho: don't panic with run on second mdt 47/47147/25
Alexey Lyashkov [Tue, 26 Apr 2022 15:04:05 +0000 (18:04 +0300)]
LU-15784 obdecho: don't panic with run on second mdt

obdecho should return errors correctly in these situations:
1. when connected to devices other than mdd, due to structure differences.
2. when running operations against remote objects.

HPe-bug-id: LUS-10913
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I11c524f205533287a9b5724419741dfbad508d29
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47147
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-15113 tests: Add margin to 398g 60/45260/6
Patrick Farrell [Fri, 15 Oct 2021 15:17:44 +0000 (11:17 -0400)]
LU-15113 tests: Add margin to 398g

Every once in a great while, some other operation I can't
identify triggers a single write RPC to a different file
in test 398g on Gatekeeper testing.

This has nothing to do with the test itself, but does
cause it to fail occasionally.  An easy solution that
isn't too bad for the test is to add a margin of +1 RPCs
to account for this.

Only modifies sanity, so trivial is OK.

test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3936077cb60259653628ed26b01470ff529b0272
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45260
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-12273 lod: metadata overstriping 34/35034/20
Patrick Farrell [Thu, 19 Jan 2023 20:05:38 +0000 (15:05 -0500)]
LU-12273 lod: metadata overstriping

This adds overstriping for MDTs, similar to overstriping
for OSTs (added in LU-9846).  This adds a new option to
setdirstripe, -C, allowing creation of more than one stripe
per MDT.  It is also possible to place multiple stripes on
the same MDT using specific striping with -m.

This allows a single directory to make fuller use of the
capability of each MDT in the file system.

Two limitations of note:
1. This requires > 1 MDT, otherwise the DNE subsystem is
not initialized.
2. Due to recovery limitations, we allow a max of only 5
stripes per MDT.

MDT overstriping increases mdtest-hard-write performance by
up to 13%, mdtest-hard-stat by 93%, at the cost of a slight
drop in mdtest-hard-read (7%), with no change in delete.

4 MDTs, 1 stripe/MDT:
mdtest-hard-write     117.399467 kIOPS : time 339.496 seconds
mdtest-hard-stat      727.020749 kIOPS : time  55.666 seconds
mdtest-hard-read      245.556392 kIOPS : time 162.897 seconds
mdtest-hard-delete    104.379111 kIOPS : time 382.710 seconds

4 MDTs, 4 stripes/MDT:
mdtest-hard-write     132.963290 kIOPS : time 309.093 seconds
mdtest-hard-stat     1408.161148 kIOPS : time  30.107 seconds
mdtest-hard-read      229.383910 kIOPS : time 179.576 seconds
mdtest-hard-delete    103.284369 kIOPS : time 398.442 seconds
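
The percentages quoted above follow from the two result sets. A short
check in Python, using only the kIOPS figures listed in this message:

```python
# kIOPS figures copied from the two mdtest-hard result sets above.
baseline = {"write": 117.399467, "stat": 727.020749,
            "read": 245.556392, "delete": 104.379111}
overstriped = {"write": 132.963290, "stat": 1408.161148,
               "read": 229.383910, "delete": 103.284369}

# Relative change in throughput, in percent, for each operation.
changes = {op: (overstriped[op] / baseline[op] - 1) * 100 for op in baseline}

for op, pct in changes.items():
    print(f"mdtest-hard-{op}: {pct:+.1f}%")
```

The delete figure moves by about one percent, which the message treats as
no change.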

Test-Parameters: testlist=sanity env=ONLY=300u serverversion=2.14.0
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I11556b223029820bd335e87c7bf073970e03468d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/35034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-16763 kunit: consolidate kernel unit testing 97/52597/6
Timothy Day [Sun, 8 Oct 2023 22:50:41 +0000 (22:50 +0000)]
LU-16763 kunit: consolidate kernel unit testing

There are several kernel modules used for different
types of unit testing. Unify them all in one place.
This will make it easier to standardize them in the
future.

Also, ensure kinode.ko is in the right place on
Ubuntu.

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY=55,ONLY_REPEAT=10 clientdistro=ubuntu2204
Test-Parameters: testlist=sanity env=ONLY=55,ONLY_REPEAT=10
Test-Parameters: testlist=sanity env=ONLY=410,ONLY_REPEAT=10
Test-Parameters: testlist=sanity env=ONLY=60a,ONLY_REPEAT=10
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I16e5fc3dfb570d88c7ed817eab74511a22e91ac6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52597
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 mdd: Fix style issues for mdd_device.c 75/54075/2
Arshad Hussain [Fri, 16 Feb 2024 10:04:12 +0000 (15:34 +0530)]
LU-6142 mdd: Fix style issues for mdd_device.c

This patch fixes issues reported by checkpatch
for file lustre/mdd/mdd_device.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I623c11cb7ccd7b19407d410c2828f6fa1055f733
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54075
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 mdt: Fix style issues for mdt_handler.c 62/54062/3
Arshad Hussain [Thu, 15 Feb 2024 07:45:09 +0000 (13:15 +0530)]
LU-6142 mdt: Fix style issues for mdt_handler.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_handler.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iab06a6074c7448ba631cc8b83151253cc8b35fa2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54062
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 ptlrpc: Fix style issues for niobuf.c 61/54061/3
Arshad Hussain [Thu, 15 Feb 2024 05:53:20 +0000 (11:23 +0530)]
LU-6142 ptlrpc: Fix style issues for niobuf.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/niobuf.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I2b431ef591fe3e920e57ce173250e600dc3b5f1f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54061
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-6142 ptlrpc: Fix style issues for sec_config.c 60/54060/2
Arshad Hussain [Thu, 15 Feb 2024 03:36:15 +0000 (09:06 +0530)]
LU-6142 ptlrpc: Fix style issues for sec_config.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/sec_config.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I3cdf2d900f3e4628c928ed513732c7fbc564124c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54060
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-6142 ptlrpc: Fix style issues for import.c 59/54059/2
Arshad Hussain [Thu, 15 Feb 2024 05:13:28 +0000 (10:43 +0530)]
LU-6142 ptlrpc: Fix style issues for import.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/import.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I81aedd7fdb485932645a085a20359919f5a1b935
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54059
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-6142 obdclass: Fix style issues for genops.c 56/54056/3
Arshad Hussain [Thu, 15 Feb 2024 03:06:02 +0000 (08:36 +0530)]
LU-6142 obdclass: Fix style issues for genops.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/genops.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ifa8bc6e26e7dd3129e234d1d4626e28614419ddd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54056
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 obdclass: Fix style issues for lu_object.c 39/54039/2
Arshad Hussain [Wed, 14 Feb 2024 11:25:41 +0000 (16:55 +0530)]
LU-6142 obdclass: Fix style issues for lu_object.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/lu_object.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I4dea184d749bc79611c324b544187dc0773aed72
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 obdclass: Fix style issues for cl_page.c 38/54038/2
Arshad Hussain [Wed, 14 Feb 2024 10:22:53 +0000 (15:52 +0530)]
LU-6142 obdclass: Fix style issues for cl_page.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/cl_page.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7902663406b486e386693604e08d2709980955c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54038
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-17535 gss: fix lsvcgssd crash in krb lib 23/54023/5
Bruno Faccini [Tue, 13 Feb 2024 11:14:40 +0000 (12:14 +0100)]
LU-17535 gss: fix lsvcgssd crash in krb lib

This patch fixes the logic that decides whether
gss_delete_sec_context() must be called, which depends on the
Kerberos implementation.

The address of snd->ctx, rather than its value, should be passed to
serialize_context_for_kernel()/serialize_krb5_ctx(). This allows
each implementation to reset it to GSS_C_NO_CONTEXT if the context
has been destroyed internally. Cases where it has not been
destroyed can now also be handled in handle_krb().
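
The handle-by-address pattern described here can be sketched in
isolation. This is only an illustration under stated assumptions:
mock_ctx_t, MOCK_NO_CONTEXT and every function below are hypothetical
stand-ins, not the GSS-API or lsvcgssd code. The point is that a
callee which destroys the context can clear the caller's handle, so
the caller knows whether a delete is still needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real GSS-API types. */
typedef struct mock_ctx { int id; } *mock_ctx_t;
#define MOCK_NO_CONTEXT ((mock_ctx_t)NULL)

/* A serializer that destroys the context internally and, because it
 * received the handle's address, clears the caller's copy as well. */
int serialize_consuming(mock_ctx_t *ctx)
{
	free(*ctx);
	*ctx = MOCK_NO_CONTEXT;
	return 0;
}

/* A serializer that leaves the context alive for the caller to delete. */
int serialize_preserving(mock_ctx_t *ctx)
{
	(void)ctx;
	return 0;
}

/* Caller: delete the context only if the handle is still valid.
 * Returns 1 if the caller had to delete it, 0 if the callee already
 * did, and -1 on allocation failure. */
int handle_ctx(int (*serialize)(mock_ctx_t *))
{
	mock_ctx_t ctx = malloc(sizeof(*ctx));
	int deleted = 0;

	if (ctx == NULL)
		return -1;
	ctx->id = 1;
	serialize(&ctx);		/* pass the address, not the value */
	if (ctx != MOCK_NO_CONTEXT) {
		free(ctx);		/* stands in for the delete call */
		deleted = 1;
	}
	return deleted;
}
```

Had the handle been passed by value, the caller could not tell the two
serializers apart and would either leak the context or free it twice.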

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I752712168a2c0f0a5a7a496b851d4cddbb7e4236
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54023
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-6142 llite: Fix style issues for lproc_llite.c 06/54006/2
Arshad Hussain [Mon, 12 Feb 2024 10:44:00 +0000 (16:14 +0530)]
LU-6142 llite: Fix style issues for lproc_llite.c

This patch fixes issues reported by checkpatch
for file lustre/llite/lproc_llite.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id4c96fa903323b73b4e1416835d8a8bb25043781
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54006
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 llite: Fix style issues for files under llite 05/54005/2
Arshad Hussain [Mon, 12 Feb 2024 10:18:54 +0000 (15:48 +0530)]
LU-6142 llite: Fix style issues for files under llite

This patch fixes issues reported by checkpatch
for files:
  lustre/llite/lcommon_cl.c
  lustre/llite/vvp_object.c
  lustre/llite/xattr.c
  lustre/llite/xattr_cache.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8c4e89b73e29b1a687e1703e721ee083457be84f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54005
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 llite: Fix style issues for vvp_dev.c 04/54004/2
Arshad Hussain [Mon, 12 Feb 2024 09:45:22 +0000 (15:15 +0530)]
LU-6142 llite: Fix style issues for vvp_dev.c

This patch fixes issues reported by checkpatch
for file lustre/llite/vvp_dev.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ie5b3e13e052ca8ae5ff39141473037fd782d1e30
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54004
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 ldlm: Fix style issues for ldlm folder 03/54003/2
Arshad Hussain [Mon, 12 Feb 2024 06:07:38 +0000 (11:37 +0530)]
LU-6142 ldlm: Fix style issues for ldlm folder

This patch fixes issues reported by checkpatch
for files under folder lustre/ldlm/

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I3c15c6a6e3d21bce9c8609e60ec481b484f00480
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54003
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 ldlm: Fix style issues for ldlm_lock.c 02/54002/2
Arshad Hussain [Sun, 11 Feb 2024 20:42:19 +0000 (02:12 +0530)]
LU-6142 ldlm: Fix style issues for ldlm_lock.c

This patch fixes issues reported by checkpatch
for file lustre/ldlm/ldlm_lock.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I492eacb0bf8033a78f1001a350c9fe4258729693
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54002
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 fld: Fix style issues for fld_internal.h 00/54000/2
Arshad Hussain [Sun, 11 Feb 2024 15:13:44 +0000 (20:43 +0530)]
LU-6142 fld: Fix style issues for fld_internal.h

This patch fixes issues reported by checkpatch
for file lustre/fld/fld_internal.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id4e91c2a892015b847e9139eae357fc33644153f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 fid: Fix style issues for fid_internal.h 99/53999/2
Arshad Hussain [Sun, 11 Feb 2024 14:52:45 +0000 (20:22 +0530)]
LU-6142 fid: Fix style issues for fid_internal.h

This patch fixes issues reported by checkpatch
for file lustre/fid/fid_internal.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I73cf72c107879b341ff868b437dc36649083e2fd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 uapi: Fix style issues for lustre_idl.h 85/53985/3
Arshad Hussain [Fri, 9 Feb 2024 09:39:45 +0000 (15:09 +0530)]
LU-6142 uapi: Fix style issues for lustre_idl.h

This patch fixes issues reported by checkpatch
for file lustre/include/uapi/linux/lustre/lustre_idl.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I6031ca0dd9b0cf7b5503ff92431f391548af8f0d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 misc: Fix style issues for files under lustre/include/ 67/53967/2
Arshad Hussain [Thu, 8 Feb 2024 06:50:00 +0000 (12:20 +0530)]
LU-6142 misc: Fix style issues for files under lustre/include/

This patch fixes issues reported by checkpatch
for files:
  lustre/include/lustre_linkea.h
  lustre/include/lustre_nodemap.h
  lustre/include/lustre_nrs.h
  lustre/include/lustre_osc.h
  lustre/include/lustre_quota.h
  lustre/include/lustre_scrub.h
  lustre/include/lustre_update.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ia70448d6e7f063e2edca089b66f43d0c440447a5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53967
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 osp: Fix style issues for lu_object.h 52/53952/3
Arshad Hussain [Wed, 7 Feb 2024 08:32:33 +0000 (14:02 +0530)]
LU-6142 osp: Fix style issues for lu_object.h

This patch fixes issues reported by checkpatch
for file lustre/include/lu_object.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ia16c0c56e92103ef172c422f45d646d2e27b7f6a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 lustre: Fix style issues for dt_object.h 49/53949/2
Arshad Hussain [Wed, 7 Feb 2024 04:50:47 +0000 (10:20 +0530)]
LU-6142 lustre: Fix style issues for dt_object.h

This patch fixes issues reported by checkpatch
for file lustre/include/dt_object.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8f8df933cea0b9bfadf6fff130bcfca3f862242c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 ptlrpc: Fix style issues for lustre_net.h 38/53938/2
Arshad Hussain [Tue, 6 Feb 2024 09:19:19 +0000 (14:49 +0530)]
LU-6142 ptlrpc: Fix style issues for lustre_net.h

This patch fixes issues reported by checkpatch
for file lustre/include/lustre_net.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ifd0a6d41657033ba708adaa918a0fbed5080fa7b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 ldlm: Fix style issues for lustre_dlm.h 18/53918/5
Arshad Hussain [Mon, 5 Feb 2024 07:11:54 +0000 (12:41 +0530)]
LU-6142 ldlm: Fix style issues for lustre_dlm.h

This patch fixes issues reported by checkpatch
for file lustre/include/lustre_dlm.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I38ed69a093786157ff3ae16670a3c6f9125f13ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53918
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 lustre: Fix style issues for lustre_export.h 16/53916/3
Arshad Hussain [Mon, 5 Feb 2024 07:38:32 +0000 (13:08 +0530)]
LU-6142 lustre: Fix style issues for lustre_export.h

This patch fixes issues reported by checkpatch
for file lustre/include/lustre_export.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8a55aaad0702773ad83f4d7f7798d5509c086ba8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53916
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 obdclass: Fix style issues for lustre_idmap.h 13/53913/2
Arshad Hussain [Mon, 5 Feb 2024 10:45:01 +0000 (16:15 +0530)]
LU-6142 obdclass: Fix style issues for lustre_idmap.h

This patch fixes issues reported by checkpatch
for file lustre/include/lustre_idmap.h

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I975d7f719bb2841db93c6b9cda530e02984d9ca3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53913
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 misc: fix style issues in uapi headers 99/53899/3
Arshad Hussain [Sat, 3 Feb 2024 17:42:05 +0000 (23:12 +0530)]
LU-6142 misc: fix style issues in uapi headers

This patch fixes issues reported by checkpatch
for all files under folder lustre/include/uapi/linux/lustre/

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I808bfd5f91d9b9b0cbb019206d4ff306702a183c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53899
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-17469 llite: hold object reference in IO 19/53819/3
Bobi Jam [Thu, 25 Jan 2024 11:20:27 +0000 (19:20 +0800)]
LU-17469 llite: hold object reference in IO

There can be a race between a page write and an inode free. Hold
a cl_object reference during the IO to avoid accessing a freed object.
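
The idea of pinning an object for the duration of an IO can be
sketched as follows. This is a single-threaded model with hypothetical
names (struct obj, obj_get, obj_put, do_io), not the cl_object API; a
flag stands in for the actual free so the outcome can be inspected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-counted object. */
struct obj {
	int refcount;
	bool freed;	/* set instead of really freeing, for inspection */
};

void obj_get(struct obj *o)
{
	o->refcount++;
}

void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		o->freed = true;
}

/* Runs an "IO" while holding its own reference. The callback models a
 * concurrent teardown dropping its reference in the middle of the IO.
 * Returns true if the object was still alive when the IO used it. */
bool do_io(struct obj *o, void (*concurrent)(struct obj *))
{
	bool alive;

	obj_get(o);		/* pin the object for the whole IO */
	concurrent(o);		/* e.g. inode teardown drops its reference */
	alive = !o->freed;	/* the IO touches the object here */
	obj_put(o);		/* last reference gone: object is released */
	return alive;
}
```

Without the obj_get()/obj_put() pair around the IO, the teardown's put
would drop the count to zero while the IO was still using the object.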

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic70cc27430e68265aba0662fc68e9bfe2f86cfe1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53819
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-17468 lod: component add missed pattern info 17/53817/3
Bobi Jam [Thu, 25 Jan 2024 03:56:42 +0000 (11:56 +0800)]
LU-17468 lod: component add missed pattern info

"lfs setstripe --component-add" did not set the component pattern,
so some settings, such as overstriping, were lost.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7ad746a550f1afea54a6f5b68823a79a85a44082
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53817
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-9680 lnet: Convert net_fault.c to work with large NIDs 31/53731/4
Chris Horn [Thu, 8 Feb 2024 16:25:51 +0000 (11:25 -0500)]
LU-9680 lnet: Convert net_fault.c to work with large NIDs

Modify the lnet fault injection to handle large NIDs.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0d57d3bf562444250b10fd83437107e2e3fe5a1b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53731
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-13802 llite: add hybrid IO SBI flag 92/52592/17
Patrick Farrell [Tue, 24 Oct 2023 18:37:55 +0000 (14:37 -0400)]
LU-13802 llite: add hybrid IO SBI flag

Add an SBI flag so hybrid IO can be fully disabled.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I2825b4cf261f98d71a18cd66d6fe3632dfabc37a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52592
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
2 months agoLU-13802 llite: tag switched hybrid IOs 03/52703/6
Patrick Farrell [Tue, 24 Oct 2023 18:36:17 +0000 (14:36 -0400)]
LU-13802 llite: tag switched hybrid IOs

If we switched IO type with hybrid IO, tag the IO in the
cl_io.  This will be used to make various choices later.

Also add a more verbose debug message for DIO, printing
various aspects of the IO.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I347ef059eadcd9fd3767d7defc2e3da0eeb5573b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52703
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-13802 llite: trivial bio_dio switch check 86/52586/13
Qian Yingjin [Fri, 6 Oct 2023 19:33:32 +0000 (15:33 -0400)]
LU-13802 llite: trivial bio_dio switch check

This adds a trivial version of the DIO BIO switch checking
function which doesn't ever switch.  This creates the basic
check function which we'll add to in future patches.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia01df8d0f33246d3833c5327bcb1a07ac305492b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-13802 llite: refactor ll_file_io_generic decs 87/52587/11
Patrick Farrell [Fri, 6 Oct 2023 19:39:41 +0000 (15:39 -0400)]
LU-13802 llite: refactor ll_file_io_generic decs

The variable declarations in ll_file_io_generic are in no
order at all.  Put them in the standard order and convert
a few 'unsigned int' to bool.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0b808ab82bdc129853dd4f27b93b3c91b201ca8a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
2 months agoLU-13814 osc: skip lru_add for transient pages 70/52070/12
Patrick Farrell [Wed, 23 Aug 2023 18:53:53 +0000 (14:53 -0400)]
LU-13814 osc: skip lru_add for transient pages

Transient pages do not go in the LRU, so don't bother
trying to add them.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I90e3cca2229e1ae7d769c0534b5b6e0be2357ad9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52070
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-13814 llite: refactor ll_direct_rw_pages 99/52399/4
Patrick Farrell [Sun, 17 Sep 2023 17:57:15 +0000 (13:57 -0400)]
LU-13814 llite: refactor ll_direct_rw_pages

ll_direct_rw_pages has some oddities in the control flow,
which make it a little harder to understand.  Clean those
up so it's easier to modify.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I64b4639df948556da03824a71b4b30806deced0d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
2 months agoLU-13805 llite: make page_list_{add,del} symmetric 57/52057/7
Patrick Farrell [Sun, 17 Sep 2023 18:05:33 +0000 (14:05 -0400)]
LU-13805 llite: make page_list_{add,del} symmetric

An earlier patch created the slightly frightening situation
where we use cl_page_list_del to remove references which
were not taken by cl_page_list_add.

This asymmetry is scary, so let's not do it.  Instead, DIO
now explicitly puts the only cl_page reference it takes.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I832d8ca7dc7f2f99dc30f972197bebc83b8b5977
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52057
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
2 months agoLU-9839 clio: lov active ios accounting fix 38/51638/10
Alexander Zarochentsev [Tue, 21 Nov 2023 14:46:44 +0000 (09:46 -0500)]
LU-9839 clio: lov active ios accounting fix

ASSERT(atomic_read(&lov->lo_active_ios)==0) is triggered due to a
bug in active_ios accounting. For some cl_io_init(,CIT_MISC,,)
calls, the increment of the lov_active_ios counter is not protected
by the layout lock. The checks for active_ios != 0 are therefore
racy and do not prevent another thread from starting a new cl_io and
incrementing the active_ios counter after a check but before the
assertion.

The lov_active_ios counter increment should be done under the
same condition as taking the layout type lock.
The ci_type=CIT_MISC and ci_ignore_layout=1 should not be used
in ll_dom_finish_open() as the I/O doesn't come
"from the osc layer" and may race with a layout change.

HPE-bug-id: LUS-11628
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I35fda85b968b847a87e73dd36bbb1648c744d62c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-16356 hsm: store crh in rhashtable instead of list 84/49284/28
Sergey Cheremencev [Thu, 31 Aug 2023 16:12:51 +0000 (18:12 +0200)]
LU-16356 hsm: store crh in rhashtable instead of list

Store coordinator restore handles in an rhashtable instead of a list.
Searching a list with more than a million entries takes too long and
leaves many tasks waiting due to contention on cdt_restore_lock.
As cdt_restore_lock is no longer needed to protect
cdt_restore_handle_list, this patch also solves the problem with
parallel restore requests (LU-15132).
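
To illustrate why a keyed hash lookup avoids the long list walk, here
is a minimal userspace sketch. It is a fixed-size chained hash table
with hypothetical names (struct handle, struct table, table_insert,
table_lookup) and an assumed 64-bit key; it is not the kernel
rhashtable API, which also resizes and supports lockless lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 1024u

/* Hypothetical restore handle keyed by a 64-bit identifier. */
struct handle {
	uint64_t key;
	struct handle *next;
};

struct table {
	struct handle *buckets[NBUCKETS];
};

/* Multiplicative hash: keep the top 10 bits to index 1024 buckets. */
static size_t bucket_of(uint64_t key)
{
	return (size_t)((key * 0x9E3779B97F4A7C15ull) >> 54);
}

void table_insert(struct table *t, struct handle *h)
{
	size_t b = bucket_of(h->key);

	h->next = t->buckets[b];
	t->buckets[b] = h;
}

/* Only one bucket chain is walked, not every stored handle. */
struct handle *table_lookup(const struct table *t, uint64_t key)
{
	struct handle *h;

	for (h = t->buckets[bucket_of(key)]; h != NULL; h = h->next)
		if (h->key == key)
			return h;
	return NULL;
}
```

With a single list, each lookup is proportional to the number of
handles; here it is proportional to the length of one chain.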

Add regression test sanity-hsm 409b.

Fixes: 66b3e74bc ("LU-15132 hsm: Protect against parallel HSM restore requests")
Test-Parameters: testlist=sanity-hsm env=ONLY=409b,ONLY_REPEAT=20
HPE-bug-id: LUS-11055
Change-Id: I3bb8788f6a0ce4c3fe4a3be85804df1c6845c313
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49284
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-16096 target: use lsd_reply_data_v1 format by default 36/50636/9
Qian Yingjin [Fri, 14 Apr 2023 08:43:18 +0000 (04:43 -0400)]
LU-16096 target: use lsd_reply_data_v1 format by default

Read-only batched RPCs such as statahead do not actually need the
lrd_batch_idx field in the reply data, so the lsd_reply_data_v2
format can be enabled only after update batched RPCs such as
MetaWBC are introduced.

In this patch, we use lsd_reply_data_v1 format and read/write
"REPLY_DATA" in old format by default.

Test-Parameters: testlist=replay-dual env=PTLDEBUG=-1,ONLY=3
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I237e719d3a8d3ff1377df8194fca00b25694273b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50636
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-17418 libcfs: support debug setup for libcfs modules 25/53825/7
James Simmons [Mon, 5 Feb 2024 02:53:44 +0000 (21:53 -0500)]
LU-17418 libcfs: support debug setup for libcfs modules

Work was landed to make Lustre ensure key libcfs components
were initialized for both a module build and a build directly
into the kernel. This change resulted in a defect that allows
you to crash a node when you load only libcfs.ko and run a
userland tool to set a debugfs setting of libcfs. The debug
handling is critical to load before anything else. Update Lustre
to handle both a module and a builtin setup. When
Lustre is built into the kernel we can't control whether
libcfs_init() is called first, so have libcfs_setup() handle
setting up the debug handling. When built as a module, have
libcfs_init() set up the debug handling instead. In both cases
libcfs_debug_init() is always called, so make sure it is
initialized only once. Add a test to validate this fix.
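
The "reachable from two entry points, initialized once" idea can be
sketched as below. All names (debug_init_once, module_init_path,
setup_path, debug_init_count) are hypothetical and the guard is a
plain flag with no locking; this is not the libcfs implementation.

```c
#include <assert.h>
#include <stdbool.h>

static bool debug_ready;	/* set once the debug state exists */
static int debug_init_runs;	/* counts real initializations */

/* Idempotent initializer: repeated calls are harmless no-ops. */
int debug_init_once(void)
{
	if (debug_ready)
		return 0;
	debug_init_runs++;	/* the real setup work would go here */
	debug_ready = true;
	return 0;
}

/* Entry point taken when the code is loaded as a module. */
int module_init_path(void)
{
	return debug_init_once();
}

/* Entry point taken on first use, e.g. when built into the kernel
 * and the init ordering cannot be controlled. */
int setup_path(void)
{
	return debug_init_once();
}

int debug_init_count(void)
{
	return debug_init_runs;
}
```

Whichever entry point runs first performs the setup, and the other
finds the flag already set.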

Fixes: f3494a6e9 ("LU-9859 libcfs: refactor libcfs initialization.")
Test-Parameters: trivial testlist=conf-sanity env=ONLY="5j"
Change-Id: If4a229e43b9e06a723546c03eb2b787ba0b16f5a
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53825
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoNew tag 2.15.61 2.15.61 v2_15_61
Oleg Drokin [Sat, 17 Feb 2024 07:29:48 +0000 (02:29 -0500)]
New tag 2.15.61

Change-Id: I2df53b16d604cc066e9118f4e404a649e177e7fd
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17413 llite: protect check in ll_merge_md_attr() 39/53639/2
Alex Zhuravlev [Wed, 10 Jan 2024 19:09:18 +0000 (22:09 +0300)]
LU-17413 llite: protect check in ll_merge_md_attr()

Striping can be applied by a concurrent process, so the check for
striping should be serialized against any concurrent process.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iffac2f1f9b53abc26705d70a30c2201b48156ac8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53639
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-17498 tests: show NIDs in node summary page 00/52500/4
Andreas Dilger [Mon, 25 Sep 2023 17:53:18 +0000 (11:53 -0600)]
LU-17498 tests: show NIDs in node summary page

Instead of only showing the network type for each node,
show the full NID in the YAML file to help with debugging and
identifying nodes in the logs.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7ee39b08c5cae5a3f9ee4ea4dbee001a6d889fbb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52500
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Lee Ochoa <lochoa@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-17287 tests: remove trap 0 27/53127/4
Alex Zhuravlev [Tue, 14 Nov 2023 05:53:00 +0000 (08:53 +0300)]
LU-17287 tests: remove trap 0

.. from destroy_test_pools() as this interrupts the current trap
chain, making stack_trap useless.

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If978389a140f21ac520ef21b505378b8f64d8f73
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53127
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-16296 tests: sanity-flr/36c to save on writes 25/49025/8
Alex Zhuravlev [Thu, 3 Nov 2022 09:36:40 +0000 (12:36 +0300)]
LU-16296 tests: sanity-flr/36c to save on writes

There is no need to write 600MB, as this may take significant
time on HDDs.

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ic6001aaba7f349a14ade1c720d175430370dd7e9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49025
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-11990 tests: enable conf-sanity 66 77/53877/3
Alexander Boyko [Mon, 15 Jan 2024 16:30:23 +0000 (11:30 -0500)]
LU-11990 tests: enable conf-sanity 66

The test was skipped because it fails with a standalone MGS.
This has been fixed since LU-13356, so enable the test again.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Idb684bb2780832f089fba1441d3b9375e9740431
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53877
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17495 build: cleanup configure messages 74/53874/2
Shaun Tancheff [Thu, 1 Feb 2024 07:24:48 +0000 (14:24 +0700)]
LU-17495 build: cleanup configure messages

Convert some remaining configure checks to use
  LB2_MSG_LINUX_TEST_RESULT

Also drop the undefined macro LC_CONFIG_HEALTH_CHECK_WRITE

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If0ae4f7549d5e1a46d6a5ce99d40ebcbd76c5e85
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53874
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17486 ldiskfs: fix race in ext4_destroy_inode 68/53868/2
Alex Zhuravlev [Wed, 31 Jan 2024 05:16:12 +0000 (08:16 +0300)]
LU-17486 ldiskfs: fix race in ext4_destroy_inode

ext4_i_callback() can race with the access to i_reserved_data_blocks
in ext4_destroy_inode() when used with a preemption-enabled kernel.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I69c6bcfbb24e6c07d28ebcd2bdd9d9e6f06ec8d1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17475 tests: Do not pass IP to do_node in wait_nm_sync 38/53838/2
Chris Horn [Sat, 13 Jan 2024 17:06:10 +0000 (11:06 -0600)]
LU-17475 tests: Do not pass IP to do_node in wait_nm_sync

If do_node() resolves to pdsh then the ':' in an IPv6 NID is
misinterpreted as specifying an rcmd module. Avoid the issue by
passing the node hostname instead of IP.
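
A small sketch of the failure mode, assuming a parser that treats
everything before the first ':' as a prefix. split_target() is a
hypothetical helper written for this illustration; it is not pdsh's
actual parsing code.

```c
#include <assert.h>
#include <string.h>

/* Splits "prefix:host" at the first ':'. Copies the prefix (possibly
 * truncated) into 'prefix' and returns a pointer to the host part.
 * With no ':' the prefix is empty and the whole string is the host.
 * 'prefix_len' must be at least 1. */
const char *split_target(const char *target, char *prefix,
			 size_t prefix_len)
{
	const char *colon = strchr(target, ':');
	size_t n;

	prefix[0] = '\0';
	if (colon == NULL)
		return target;
	n = (size_t)(colon - target);
	if (n >= prefix_len)
		n = prefix_len - 1;
	memcpy(prefix, target, n);
	prefix[n] = '\0';
	return colon + 1;
}
```

Given an IPv6 address such as "fe80::1", this kind of split yields
"fe80" as the prefix and ":1" as the host, while a plain hostname
passes through unchanged.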

Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I511308e3fb5247a85dec7f20a0ff4f3da2de4f3a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53838
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-17474 tests: Update sanity 215 for IPv6 36/53836/2
Chris Horn [Sat, 13 Jan 2024 04:16:29 +0000 (22:16 -0600)]
LU-17474 tests: Update sanity 215 for IPv6

Update regexes to handle IPv6 NIDs.

Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie8e8cba0294ac241fddeb5af9c75799d67bb6638
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17467 build: Expand CUDA source detection logic 32/53832/2
Jean-Baptiste Skutnik [Thu, 25 Jan 2024 18:52:26 +0000 (21:52 +0300)]
LU-17467 build: Expand CUDA source detection logic

Fix the configure logic so that it handles a disabled package
(variable set to 'no') for the CUDA and GDS source paths.

Signed-off-by: Jean-Baptiste Skutnik <jb.skutnik@gmail.com>
Change-Id: Icb96274a6df2508f8e3010daef0ba1d17b4471dc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53832
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17471 osd: add symlink for brw_stats 29/53829/10
Hongchao Zhang [Fri, 26 Jan 2024 13:43:36 +0000 (21:43 +0800)]
LU-17471 osd: add symlink for brw_stats

Add a symlink at /proc/fs/lustre/osd-*/*/brw_stats pointing to
/sys/kernel/debug/lustre/osd-*/*/brw_stats to fix
the compatibility issue with older utilities that
still use the old proc entry.

Test-Parameters: testlist=sanity env=ONLY=0f serverversion=2.15.4
Fixes: 8a84c7f9c7d6 ("LU-14927 osd: share brw_stats code between OSD back ends.")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ie86b2b384e3b91f98ead00b6325ddeb020e47aa5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53829
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17465 nodemap: change squash default value to 65534 02/53802/3
Sebastien Buisson [Tue, 23 Jan 2024 09:07:25 +0000 (10:07 +0100)]
LU-17465 nodemap: change squash default value to 65534

Initially, default values for nodemap.squash_uid/gid/projid were set
to 99, to match user 'nobody'. But on newer systems, nobody has
changed to 65534 and 99 no longer exists.
It is safe to use 65534 in all cases, as even on older systems it
exists and corresponds to 'nfsnobody'.

Test-Parameters: testlist=sanity env=ONLY=432 serverversion=2.15
Test-Parameters: testlist=sanity env=ONLY=432 clientversion=2.15
Test-Parameters: testlist=sanity-quota env=ONLY=75 serverversion=2.15
Test-Parameters: testlist=sanity-quota env=ONLY=75 clientversion=2.15
Test-Parameters: testlist=sanity-selinux env=ONLY=21 serverversion=2.15
Test-Parameters: testlist=sanity-selinux env=ONLY=21 clientversion=2.15
Test-Parameters: testlist=sanity-sec env=ONLY="7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 32 33 34 35 36 55 61 64" serverversion=2.15
Test-Parameters: testlist=sanity-sec env=ONLY="7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 32 33 34 35 36 55 61 64" clientversion=2.15
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2e20fda0fdc0d5bfdf964a890bfbd0b54b943cf4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53802
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
3 months agoLU-17459 lod: incorrect assert in lod_statfs_and_check() 83/53783/3
Alex Zhuravlev [Tue, 23 Jan 2024 17:02:14 +0000 (20:02 +0300)]
LU-17459 lod: incorrect assert in lod_statfs_and_check()

The assertion must be done only once we're sure this target
has not been counted/marked as active.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I56ae3fad92b8518f6aba2c880ecdac55f53cb689
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53783
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-17216 ofd: skip sanity/70a on old OSTs 70/53770/4
Timothy Day [Tue, 23 Jan 2024 03:33:27 +0000 (03:33 +0000)]
LU-17216 ofd: skip sanity/70a on old OSTs

OSTs older than 2.15.59 won't have enable_health_write.
Skip sanity/70a, which requires it, on those OSTs.
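
The version gate can be sketched as below. The helper names are hypothetical and this is not the test framework's own code; the point is that versions are compared as integer tuples rather than as strings.

```python
def version_tuple(version):
    """Turn '2.15.59' into (2, 15, 59) so comparison is numeric."""
    return tuple(int(part) for part in version.split("."))


def should_skip_70a(ost_version, minimum="2.15.59"):
    """Return True when the OST is too old to have enable_health_write."""
    return version_tuple(ost_version) < version_tuple(minimum)


print(should_skip_70a("2.15.4"))
print(should_skip_70a("2.15.59"))
```

A plain string comparison would get cases such as "2.15.100" wrong, because "1" sorts before "5".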

Test-Parameters: trivial
Test-Parameters: testlist=sanity clientversion=2.15 env=ONLY=70a,ONLY_REPEAT=10
Test-Parameters: testlist=sanity serverversion=2.15 env=ONLY=70a,ONLY_REPEAT=10
Fixes: e383791 ("LU-17216 ofd: make enable_health_write tunable")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I320f6911e7b7064d49761a022c462b7c20f3a2e1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53770
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter
3 months agoLU-17452 tests: fix interop sanityn tests with b2_15 59/53759/3
Etienne AUJAMES [Mon, 22 Jan 2024 10:44:11 +0000 (11:44 +0100)]
LU-17452 tests: fix interop sanityn tests with b2_15

sanityn 77q and 77r require server fixes to pass.
This patch adds a server version check to those tests.

Fixes: 44cc782 ("LU-9859 ptlrpc: simplifying expression parsing in nrs_tbf")
Fixes: c098c09 ("LU-14976 nrs: change nrs policies at run time")
Test-Parameters: trivial
Test-Parameters: clientversion=2.15.4 testlist=sanityn
Test-Parameters: serverversion=2.15.4 testlist=sanityn
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I91b30284e9a3c24c9709215f509ca75923214c5b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53759
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter
3 months agoLU-8191 llverfs: fix non-static functions 54/53754/5
Timothy Day [Mon, 22 Jan 2024 02:51:49 +0000 (02:51 +0000)]
LU-8191 llverfs: fix non-static functions

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in llverfs.c static.

Making functions new_file() and new_dir() static
causes new format truncation errors. Check the
return of snprintf() to silence these.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ieccf1e40c1da627571a7a95adbb85599185f1342
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53754
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17438 utils: fix build for wirecheck 16/53716/3
Etienne AUJAMES [Wed, 17 Jan 2024 16:51:58 +0000 (17:51 +0100)]
LU-17438 utils: fix build for wirecheck

Fix wirecheck compilation and regenerate wiretest files.

Fixes: 6a20bdc ("LU-11376 lov: new foreign LOV format")
Fixes: 15d44e7 ("LU-12682 llite: fake symlink type of foreign file/dir")
Fixes: aebb405 ("LU-10499 pcc: use foreign layout for PCCRO on server side")
Fixes: 0ea23e0 ("LU-13307 nodemap: have nodemap_add_member support large NIDs")
Test-Parameters: trivial testlist=sanity env=ONLY=58
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I3a312136da00ba726887660575f6558faf167241
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17421 build: Update check for arc_prune_func_t parameters 64/53664/6
Brian Atkinson [Fri, 12 Jan 2024 00:36:59 +0000 (17:36 -0700)]
LU-17421 build: Update check for arc_prune_func_t parameters

In OpenZFS 2.2.1 the code for arc_prune_async() was unified so that
FreeBSD and Linux no longer have separate implementations of the
same code. Part of this update changed the first parameter of
arc_prune_func_t to be a uint64_t.

Without this patch, Lustre would not build with ZFS 2.2.1 because of
an incompatible pointer type error for the arc_prune_func_t
function pointer passed to arc_add_prune_callback().

Test-Parameters: trivial
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Change-Id: Iaa03cc9421f27a8517ce04817f04102de9adb86a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53664
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Akash B <akash-b@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
3 months agoLU-16913 quota: notify newest lqe in qmt_set_id_notify 37/53637/2
Sergey Cheremencev [Wed, 10 Jan 2024 18:56:03 +0000 (21:56 +0300)]
LU-16913 quota: notify newest lqe in qmt_set_id_notify

It is possible that lqe_locate may call lqe_find inside
qmt_pool_lqes_lookup_spec and insert the 2nd lqe into
lqs_hash while the previous one is being processed. Do not add the
1st lqe to be processed by qmt_reba_thread in qmt_id_lock_notify,
as this lqe will be freed at the end of lqe_locate_find due
to the race with the 2nd one that already exists in lqs_hash.
This change may fix the following assertion failure:

  (qmt_lock.c:950:qmt_id_lock_glimpse()) ASSERTION( lqe->lqe_gl ) failed:
  (qmt_lock.c:950:qmt_id_lock_glimpse()) LBUG

Fixes: 09f9fb3211 ("LU-11023 quota: quota pools for OSTs")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I3a3114d880077c87e61fccf4f32e3845bd42d842
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-15496 tests: add debugging to sanity/398c 62/53462/6
Andreas Dilger [Thu, 14 Dec 2023 14:23:12 +0000 (07:23 -0700)]
LU-15496 tests: add debugging to sanity/398c

Dump the rpc_stats to help understand why the test is failing.

Test-Parameters: trivial testlist=sanity clientarch=ppc64le env=ONLY=398c,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5ed1b7133eddd242b234a05a670e152e4ca359b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53462
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17276 ldlm: add interval in flock 47/53447/11
Yang Sheng [Wed, 13 Dec 2023 20:30:36 +0000 (04:30 +0800)]
LU-17276 ldlm: add interval in flock

Add necessary changes for using interval tree in flock.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I94c416b4215b863b54eccfe7025f2976fe40181a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53447
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17334 lmv: handle object created on newly added MDT 63/53363/6
Lai Siyao [Thu, 7 Dec 2023 12:39:09 +0000 (07:39 -0500)]
LU-17334 lmv: handle object created on newly added MDT

When a new MDT is added to a filesystem without no_create, a new
object is created on the MDT relatively quickly after it is added,
in particular because the new MDT is preferred by QOS space
balancing due to its large amount of free space. However, it might
take a few seconds for the addition of the new MDT to be propagated
to all of the clients, so there is a risk that one client creates
a directory on an MDT that another client is not yet aware of, which
returns an error to the application immediately.

This patch fixes the issue by adding lmv_tgt_retry(), which retries
the MDT lookup and waits some number of seconds for the filesystem
layout to be updated if the MDT index of an existing file/directory
is not found.

Commands that depend on user input, like 'lfs mkdir -i' and 'lfs df',
and round-robin MDT allocation will continue to use lmv_tgt(), which
doesn't retry. If the user specifies a wrong MDT index, retrying
could hang the command for an extended period of time.
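
The difference between the two lookups can be sketched as follows. This is a simplified model and not the patch itself: the target table is a plain dict, and the names `find_target` and `find_target_retry` as well as the timeout and interval values are assumptions made for the illustration.

```python
import time


def find_target(targets, index):
    """Single lookup: return the target, or None if the index is unknown."""
    return targets.get(index)


def find_target_retry(targets, index, timeout=5.0, interval=0.1,
                      sleep=time.sleep, clock=time.monotonic):
    """Retry the lookup until the target appears or the timeout expires."""
    deadline = clock() + timeout
    while True:
        tgt = targets.get(index)
        if tgt is not None:
            return tgt
        if clock() >= deadline:
            # Give up: the index is probably wrong, not merely late.
            return None
        sleep(interval)
```

The non-retrying variant suits user-supplied indexes, where a missing target most likely means a typo and waiting would only delay the error.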

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idb0cf65e95f665628d6799298732b7a06cde4a86
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53363
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17297 grant: move tgt_grant_sanity_check() calls 71/53171/4
Vladimir Saveliev [Fri, 17 Nov 2023 15:30:06 +0000 (18:30 +0300)]
LU-17297 grant: move tgt_grant_sanity_check() calls

Call tgt_grant_sanity_check() in ofd_obd_disconnect() and in
mdt_obd_disconnect() after call to tgt_grant_discard().

Otherwise, the sum of grants does not match the total grant counter,
which is reported as a LustreError:
    ofd_obd_disconnect: tot_granted 0 != fo_tot_granted 8388608

This is because, on stale export eviction,
class_disconnect_stale_exports() moves stale exports to a separate list
but does not update the obd's grant counters.
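
The accounting mismatch can be modelled in a few lines. This is a toy sketch under stated assumptions: the classes and method names are hypothetical and only mirror the roles of the export list, the total grant counter, the discard step and the sanity check.

```python
class Export:
    def __init__(self, granted):
        self.granted = granted


class Target:
    def __init__(self):
        self.exports = []
        self.tot_granted = 0

    def connect(self, granted):
        exp = Export(granted)
        self.exports.append(exp)
        self.tot_granted += granted
        return exp

    def evict_stale(self, exp):
        # The export leaves the main list, but the counter is not touched.
        self.exports.remove(exp)

    def grant_discard(self, exp):
        # Return the export's grant to the target-wide counter.
        self.tot_granted -= exp.granted
        exp.granted = 0

    def sanity_check(self):
        return sum(e.granted for e in self.exports) == self.tot_granted


tgt = Target()
exp = tgt.connect(8388608)
tgt.evict_stale(exp)
print(tgt.sanity_check())   # checked before the discard: sums disagree
tgt.grant_discard(exp)
print(tgt.sanity_check())   # checked after the discard: sums agree
```

Running the check only after the discard step is what keeps the two sums consistent in this model.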

A test to illustrate the issue is included.

HPE-bug-id: LUS-11469
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: I0b4568b88a2fe7b50f4eac50b4b064d7afbc7a75
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53171
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-17271 kfilnd: Allocate tn_mr_key before kfilnd_peer 29/53029/4
Chris Horn [Tue, 7 Nov 2023 22:19:26 +0000 (15:19 -0700)]
LU-17271 kfilnd: Allocate tn_mr_key before kfilnd_peer

A race exists between kfilnd_peer and tn_mr_key allocation that could
result in RKEY re-use and data corruption.

Thread 1: Posts tagged receive with RKEY based on
          peerA::kp_local_session_key X and tn_mr_key Y
Thread 2: Fetches peerA with kp_local_session_key X
Thread 1: Cancels tagged receive, marks peerA for removal, and
          releases tn_mr_key Y
Thread 2: allocates tn_mr_key Y
At this point, thread 2 has the same RKEY used by thread 1.

The fix is to always allocate the tn_mr_key before looking up the
peer, and always mark peers for removal before releasing tn_mr_key.
This commit modifies the TN allocation to ensure the tn_mr_key is
allocated before looking up the target peer.
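
The interleaving above can be replayed deterministically. The sketch below is a model, not kfilnd code: `KeyPool` and `PeerCache` are hypothetical stand-ins, the RKEY is represented as a (session key, mr key) pair, and the key pool is assumed to reuse the lowest free key.

```python
class KeyPool:
    """Hands out the lowest free integer, so released keys are reused."""
    def __init__(self):
        self.in_use = set()

    def alloc(self):
        key = 0
        while key in self.in_use:
            key += 1
        self.in_use.add(key)
        return key

    def release(self, key):
        self.in_use.discard(key)


class PeerCache:
    """Gives every newly created peer a fresh session key."""
    def __init__(self):
        self.peers = {}
        self.next_session = 100

    def lookup(self, name):
        if name not in self.peers:
            self.peers[name] = self.next_session
            self.next_session += 1
        return self.peers[name]

    def remove(self, name):
        self.peers.pop(name, None)


def run(allocate_key_first):
    keys, peers = KeyPool(), PeerCache()
    mr1 = keys.alloc()                      # thread 1 posts its receive
    rkey1 = (peers.lookup("peerA"), mr1)
    if allocate_key_first:
        mr2 = keys.alloc()                  # thread 2: key first
        peers.remove("peerA")               # thread 1 cancels:
        keys.release(mr1)                   #   remove peer, then release key
        session2 = peers.lookup("peerA")    # thread 2: peer lookup second
    else:
        session2 = peers.lookup("peerA")    # thread 2: peer first
        peers.remove("peerA")
        keys.release(mr1)
        mr2 = keys.alloc()                  # thread 2 picks up the freed key
    return rkey1, (session2, mr2)


print(run(allocate_key_first=False))
print(run(allocate_key_first=True))
```

With the peer fetched first, both threads end up with the same pair; with the key allocated first, the second thread gets a different pair in this interleaving.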

HPE-bug-id: LUS-11972
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I2e0948ae4fe7c5dfb86e297a3437213f193bf67c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17271 kfilnd: Protect RKEY for bulk Put/Get 28/53028/3
Chris Horn [Tue, 7 Nov 2023 21:14:42 +0000 (14:14 -0700)]
LU-17271 kfilnd: Protect RKEY for bulk Put/Get

The initiator of a bulk Put/Get generates an RKEY based on the
values of the struct kfilnd_tn::tn_mr_key and
struct kfilnd_peer::kp_local_session_key. kp_local_session_key is
assigned at peer creation, and tn_mr_key is assigned when the
kfilnd_tn is allocated.

A bulk Put/Get can fail in various ways such that the target of the
operation may have a reference to the RKEY, but the originator cannot
know the state of the operation at the target. In these cases, the
initiator must ensure that the RKEY is not re-used. To accomplish
this, we need to delete the target peer from the originator's peer
cache to ensure that subsequent bulk Put/Get operations will use
a new kp_local_session_key, and thus avoid re-using any old RKEY
values.

HPE-bug-id: LUS-11972
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If270a2df745ee88c35addc8194cdb160cb373c3e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53028
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17270 kfilnd: Check status of TAG_RX_OK in WAIT_COMP 27/53027/2
Chris Horn [Tue, 7 Nov 2023 17:36:29 +0000 (10:36 -0700)]
LU-17270 kfilnd: Check status of TAG_RX_OK in WAIT_COMP

When the target of a bulk Get/Put drops the message, it sends
ENODATA back to the initiator via immediate data. This status needs to
be accounted for while the transaction is in the TN_STATE_WAIT_COMP
state, otherwise it can be lost if the TN_EVENT_TAG_RX_OK event
arrives before the TN_EVENT_TX_OK event.
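
The ordering problem can be illustrated with a toy state machine. This is only a sketch: apart from WAIT_COMP, TX_OK and TAG_RX_OK, the state names and the class are hypothetical, and the flag `check_status_in_wait_comp` exists solely to contrast the two behaviours.

```python
ENODATA = 61  # Linux errno value, hard-coded to keep the sketch portable


class Transaction:
    def __init__(self, check_status_in_wait_comp):
        self.state = "WAIT_COMP"
        self.status = 0
        self.check_status_in_wait_comp = check_status_in_wait_comp

    def handle(self, event, status=0):
        if self.state == "WAIT_COMP" and event == "TX_OK":
            self.state = "WAIT_TAG_RX"          # hypothetical state name
        elif self.state == "WAIT_COMP" and event == "TAG_RX_OK":
            # Early arrival: keep the status only if this state checks it.
            if self.check_status_in_wait_comp and status:
                self.status = status
            self.state = "WAIT_TX"              # hypothetical state name
        elif self.state == "WAIT_TAG_RX" and event == "TAG_RX_OK":
            if status:
                self.status = status
            self.state = "DONE"
        elif self.state == "WAIT_TX" and event == "TX_OK":
            self.state = "DONE"


def final_status(events, check_status_in_wait_comp):
    tn = Transaction(check_status_in_wait_comp)
    for event, status in events:
        tn.handle(event, status)
    return tn.state, tn.status


rx_first = [("TAG_RX_OK", ENODATA), ("TX_OK", 0)]
print(final_status(rx_first, check_status_in_wait_comp=False))
print(final_status(rx_first, check_status_in_wait_comp=True))
```

When the tagged receive event arrives first, the status survives only if the WAIT_COMP branch records it.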

HPE-bug-id: LUS-11971
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I52d6ea52746cbc14a86478fcccb32b25badd3b0a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53027
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-16822 tests: Skip tests lacking large NID support 27/53727/4
Chris Horn [Tue, 5 Dec 2023 04:25:01 +0000 (22:25 -0600)]
LU-16822 tests: Skip tests lacking large NID support

Test 230 - Needs lctl conn_list but this does not support large NIDs.

Tests 204-207,209-213,216,218,231,302,500 - These test cases use
commands that do not support large NIDs (drop rules, printing recovery
queue, etc.), or do not work properly (lnetctl import/export,
lctl which_nid).

Tests 101,103 - These tests exercise NID ranges, so they do not
work with large NIDs.

Tests 100,102,105-106 - These test cases need to be re-written to
specify valid IPv6 NIDs.

Test 220 - Calls lst which does not support large NIDs.

Test 250 - Uses ksocklnd-config but this does not support IPv6.

Test 208 and 255 use ip2nets and routes parameters that do not support
large NIDs.

Test 214 - If the destination NID to the ping commands is IPv6, then
the fake interface cannot be cleaned up.

Some places where drop or delay rules were added did not check for
success or failure. This has been corrected.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I7f251a82aa2eee304419a765df728a014b9c9e27
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53727
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-16967 build: Separate lnet LND deb packaging 97/52397/6
Shaun Tancheff [Fri, 26 Jan 2024 17:57:35 +0000 (00:57 +0700)]
LU-16967 build: Separate lnet LND deb packaging

Enable packaging of lnet lnd kernel modules into
separate packages with the build profile multiple-lnds:

  lustre-lnet-module-socklnd for socklnd.ko
  lustre-lnet-module-gnilnd for kgnilnd.ko, profile gnilnd
  lustre-lnet-module-kfilnd for kkfilnd.ko, profile kfilnd
  lustre-lnet-module-o2iblnd for o2iblnd.ko, profile ext_o2ib
  lustre-lnet-module-in-kernel-o2iblnd for ko2iblnd.ko,
     profile int_o2ib

Test-Parameters: trivial
HPE-bug-id: LUS-11711
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3a5ca03fa410238f66083289db0899c8b4bfab5c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52397
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-16498 obdclass: change uc_lock to rwlock 95/52395/13
Sebastien Buisson [Thu, 14 Sep 2023 16:00:04 +0000 (18:00 +0200)]
LU-16498 obdclass: change uc_lock to rwlock

Change the upcall cache uc_lock to a read-write lock so that threads
can get the read lock to do concurrent lookups in the upcall cache,
and only grab the write lock in the rare case when a new entry is
added or old entries are expired. That reduces serialization between
server threads during normal operation, and avoids all of the threads
spinning for some time if the requested key (UID or gss context) is
not in the cache at all, before they sleep.
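
The locking pattern can be sketched in user space as below. This is an illustration of the read-mostly idea, not the kernel code: `RWLock` and `Cache` are hypothetical, the lock is a minimal one built on a condition variable with no writer preference, and the entry is fetched while the write lock is held purely to keep the sketch short.

```python
import threading


class RWLock:
    """Minimal readers-writer lock: many readers or one writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class Cache:
    def __init__(self, fetch):
        self._lock = RWLock()
        self._entries = {}
        self._fetch = fetch

    def get(self, key):
        # Common case: concurrent lookups under the read lock.
        self._lock.acquire_read()
        try:
            if key in self._entries:
                return self._entries[key]
        finally:
            self._lock.release_read()
        # Rare case: take the write lock only to add a missing entry.
        self._lock.acquire_write()
        try:
            if key not in self._entries:  # re-check after the lock upgrade
                self._entries[key] = self._fetch(key)
            return self._entries[key]
        finally:
            self._lock.release_write()
```

The re-check under the write lock matters because another thread may have inserted the entry between the two lock acquisitions.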

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I812400104fd2115d19386fb4a03bb3ce99c49383
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52395
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-16314 debug: Enable optional unhashed pointers 77/51877/15
Shaun Tancheff [Wed, 18 Oct 2023 09:27:10 +0000 (04:27 -0500)]
LU-16314 debug: Enable optional unhashed pointers

This patch takes a page out of the kernel trace debug
playbook to rewrite format strings and change %p -> %px
on-the-fly when:

   libcfs_debug_raw_pointers

is enabled.

The module parameter can be viewed and modified by root
via lctl:
    lctl get_param debug_raw_pointers
    lctl set_param debug_raw_pointers=1

Since nothing uses the return value from libcfs_debug_msg,
change it to void.

Use percpu pre-allocated buffers for holding modified
format strings to avoid kmalloc/kfree as well as avoid
bloating stack usage.
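
The %p -> %px rewrite can be illustrated with a small scanner. This is a sketch of the idea only, not the patch's implementation: the function name is hypothetical, and it assumes that a "%p" followed by a letter or digit is an extended specifier (such as %pK) that must be left alone.

```python
def unhash_pointers(fmt, raw_pointers):
    """Rewrite plain '%p' to '%px' when raw pointer output is enabled."""
    if not raw_pointers:
        return fmt
    out = []
    i, n = 0, len(fmt)
    while i < n:
        ch = fmt[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        # "%%" is a literal percent sign: copy both characters untouched.
        if i + 1 < n and fmt[i + 1] == "%":
            out.append("%%")
            i += 2
            continue
        # Plain "%p": rewrite unless an extension character follows.
        if i + 1 < n and fmt[i + 1] == "p":
            nxt = fmt[i + 2] if i + 2 < n else ""
            if not nxt.isalnum():
                out.append("%px")
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


print(unhash_pointers("obj %p state %d", True))
print(unhash_pointers("obj %pK state %d", True))
```

Field widths and flags between '%' and 'p' are not handled here; a fuller version would need to skip over them.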

HPE-bug-id: LUS-10945
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I63d90d614ce4435b07f5e84991a12ae7351ac2bb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51877
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
3 months agoLU-16314 lnet: Migrate LASSERTF %p to %px 31/51231/7
Shaun Tancheff [Tue, 6 Jun 2023 04:07:44 +0000 (11:07 +0700)]
LU-16314 lnet: Migrate LASSERTF %p to %px

This change covers libcfs and lnet and converts LASSERTF
statements to explicitly use %px.

Use %px to explicitly report the non-hashed pointer value
in messages printed when a kernel panic is imminent. When
analyzing a crash dump, the associated kernel address can
be used to determine the system state that led to the
system crash.

As crash dumps can be, and are, provided by customers from
production systems, the use of the kernel command line
parameter:
    no_hash_pointers
is not always possible.

Ref: Documentation/core-api/printk-formats.rst

Test-Parameters: trivial
HPE-bug-id: LUS-10945
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4d0c956e1b914cea9517b632d46f1714bcd43a85
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51231
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-16314 llite: Migrate LASSERTF %p to %px 13/51213/8
Shaun Tancheff [Tue, 6 Jun 2023 03:44:53 +0000 (10:44 +0700)]
LU-16314 llite: Migrate LASSERTF %p to %px

This change covers lustre/ec through lustre/mgs and
converts LASSERTF statements to explicitly use %px.

Use %px to explicitly report the non-hashed pointer value
in messages printed when a kernel panic is imminent. When
analyzing a crash dump, the associated kernel address can
be used to determine the system state that led to the
system crash.

As crash dumps can be, and are, provided by customers from
production systems, the use of the kernel command line
parameter:
    no_hash_pointers
is not always possible.

Ref: Documentation/core-api/printk-formats.rst

Test-Parameters: trivial
HPE-bug-id: LUS-10945
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I708d9ef60c63f5b4006c7986599a2f39fc9e5fdf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-16314 obdclass: Migrate LASSERTF %p to %px 05/49405/10
Shaun Tancheff [Thu, 25 May 2023 12:01:32 +0000 (07:01 -0500)]
LU-16314 obdclass: Migrate LASSERTF %p to %px

This change covers lustre/obdclass through lustre/target and
converts LASSERTF statements to explicitly use %px.

Use %px to explicitly report the non-hashed pointer value
in messages printed when a kernel panic is imminent. When
analyzing a crash dump, the associated kernel address can
be used to determine the system state that led to the
system crash.

As crash dumps can be, and are, provided by customers from
production systems, the use of the kernel command line
parameter:
    no_hash_pointers
is not always possible.

Ref: Documentation/core-api/printk-formats.rst

Test-Parameters: trivial
HPE-bug-id: LUS-10945
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia256dc1f74f976640ec82746a5d761ef662f45ae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49405
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17426 tests: add crossdir parallel rename test 38/53738/12
Andreas Dilger [Fri, 19 Jan 2024 03:44:33 +0000 (20:44 -0700)]
LU-17426 tests: add crossdir parallel rename test

Add sanityn test_81d to test cross-dir (same-MDT) parallel rename
if the MDT supports this functionality.

Test-Parameters: trivial testlist=sanityn
Test-Parameters: testlist=sanityn serverversion=2.15 env=SANITYN_EXCEPT="77q 77r"
Test-Parameters: testlist=sanityn env=ONLY=81d,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=ONLY=81d,ONLY_REPEAT=10 mdtcount=2
Test-Parameters: testlist=sanityn env=ONLY=81d,ONLY_REPEAT=10 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic8717e6865a9c6c9698186f4fdf34c1f4f74083f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53738
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>