Whamcloud - gitweb
fs/lustre-release.git
18 hours ago LU-17662 osd-zfs: Support for ZFS 2.2.3 30/54530/9 master
Shaun Tancheff [Mon, 6 May 2024 03:06:31 +0000 (10:06 +0700)]
LU-17662 osd-zfs: Support for ZFS 2.2.3

ZFS commit zfs-2.2.99-269-g9b1677fb5
   dmu: Allow buffer fills to fail
Adds a boolean_t to dmu_buf_will_fill() and dmu_buf_fill_done()

Lustre always uses B_FALSE for this argument.

Also re-arrange and split some configure macros so that all
the zfs and ldiskfs tests can be run in the same parallel pass.

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I71a4723bfa8ce62ae6f270e26ab149bf98278d3f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17477 tests: conf-sanity/48 with debug=0 99/53799/13
Alex Zhuravlev [Wed, 24 Jan 2024 07:52:20 +0000 (10:52 +0300)]
LU-17477 tests: conf-sanity/48 with debug=0

conf-sanity/48 takes quite a long time setting 4.5K ACLs.
debug=0 improves this significantly.

Test-Parameters: trivial testlist=conf-sanity env=ONLY=48
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifa39b9efc80b41050a13323474dd19b865cc6273
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53799
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 fid: rename ptlrpc_req_finished for component fid 94/54994/2
Arshad Hussain [Thu, 2 May 2024 11:28:21 +0000 (07:28 -0400)]
LU-16741 fid: rename ptlrpc_req_finished for component fid

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
fid component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: If5bf08719ab9be8255f1145fa7bcdfebd68da52c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54994
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 fld: rename ptlrpc_req_finished for component fld 93/54993/2
Arshad Hussain [Thu, 2 May 2024 11:24:57 +0000 (07:24 -0400)]
LU-16741 fld: rename ptlrpc_req_finished for component fld

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
fld component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7229ccdb4a6440700c120a5d75edd018252b0b8a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 ldlm: rename ptlrpc_req_finished for component ldlm 92/54992/2
Arshad Hussain [Thu, 2 May 2024 11:21:02 +0000 (07:21 -0400)]
LU-16741 ldlm: rename ptlrpc_req_finished for component ldlm

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
ldlm component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I0daff368ed1b4448f236e7f8f17e1534b3db5e58
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54992
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 lfsck: rename ptlrpc_req_finished for component lfsck 91/54991/2
Arshad Hussain [Thu, 2 May 2024 11:15:06 +0000 (07:15 -0400)]
LU-16741 lfsck: rename ptlrpc_req_finished for component lfsck

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
lfsck component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I57fa0bac6ecf03a6143ca8342d0fb753dc815d60
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54991
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 quota: rename ptlrpc_req_finished for component quota 90/54990/2
Arshad Hussain [Thu, 2 May 2024 11:11:06 +0000 (07:11 -0400)]
LU-16741 quota: rename ptlrpc_req_finished for component quota

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
quota component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7e671d68be8c0209a7439dc9762b5b10039aa0a3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 mgc: rename ptlrpc_req_finished for component mgc 89/54989/2
Arshad Hussain [Thu, 2 May 2024 11:07:12 +0000 (07:07 -0400)]
LU-16741 mgc: rename ptlrpc_req_finished for component mgc

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
mgc component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7b7fac8b3cfc30b6b6e92f68018b494d24390a7c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54989
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 ptlrpc: rename ptlrpc_req_finished for component ptlrpc 88/54988/2
Arshad Hussain [Thu, 2 May 2024 10:57:31 +0000 (06:57 -0400)]
LU-16741 ptlrpc: rename ptlrpc_req_finished for component ptlrpc

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
ptlrpc component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic41d76ace564132a369288676398bc881048f851
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54988
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 mdc: rename ptlrpc_req_finished for component mdc 87/54987/2
Arshad Hussain [Thu, 2 May 2024 10:49:26 +0000 (06:49 -0400)]
LU-16741 mdc: rename ptlrpc_req_finished for component mdc

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
mdc component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I46de8facbafcabbeb5c12daefcc5172f6c9bafd5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54987
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-16741 osp: rename ptlrpc_req_finished for component osp 86/54986/2
Arshad Hussain [Thu, 2 May 2024 10:40:02 +0000 (06:40 -0400)]
LU-16741 osp: rename ptlrpc_req_finished for component osp

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
osp component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I0da0f922be2a062459c14585f910ef2a6c425b14
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54986
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17797 lnet: avoid use after free of lnet ifaces 75/54975/2
Shaun Tancheff [Wed, 1 May 2024 04:39:26 +0000 (11:39 +0700)]
LU-17797 lnet: avoid use after free of lnet ifaces

During inet4 / inet6 enumeration the array of nids can be
reallocated or freed.

When the array is freed the originating reference should be
nulled to avoid a possible use after free.
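The pattern described above, clearing the owner's pointer whenever the array is released, can be sketched in plain C as follows. This is only an illustration under stated assumptions: the names `nid_list`, `nid_list_add` and `nid_list_release` are hypothetical and are not taken from the LNet source.

```c
#include <stdlib.h>

/* Hypothetical container standing in for the interface NID array. */
struct nid_list {
    unsigned int *nids;   /* heap array, may be reallocated or freed */
    size_t count;         /* entries in use */
    size_t capacity;      /* entries allocated */
};

/* Free the array and clear the owning reference so that no caller
 * can reach the freed memory through the list afterwards. */
static void nid_list_release(struct nid_list *list)
{
    free(list->nids);
    list->nids = NULL;
    list->count = 0;
    list->capacity = 0;
}

/* Append one entry, growing the array when needed. If growing fails
 * the whole array is released and the pointer is left as NULL. */
static int nid_list_add(struct nid_list *list, unsigned int nid)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 4;
        unsigned int *tmp = realloc(list->nids, new_cap * sizeof(*tmp));

        if (tmp == NULL) {
            nid_list_release(list);
            return -1;
        }
        list->nids = tmp;
        list->capacity = new_cap;
    }
    list->nids[list->count++] = nid;
    return 0;
}
```

Because the release helper resets the pointer, a second release or a later NULL check on `list->nids` is harmless rather than a use after free.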

CoverityID: 425360 ("USE_AFTER_FREE")

Test-Parameters: trivial
Fixes: ab6c8bd18 ("LU-16822 lnet: always initialize IPv6 at start up")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ifd751e0c2f0095b33f8b2cd8dd58cfd8572c5ff4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54975
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 hours ago LU-17795 lnet: unused return code in lnet_peer_data_present 71/54971/2
Serguei Smirnov [Tue, 30 Apr 2024 17:55:29 +0000 (10:55 -0700)]
LU-17795 lnet: unused return code in lnet_peer_data_present

Coverity check detected an issue with the return code from the call to
lnet_peer_set_primary_nid() in the code added by LU-17379 patch.
Fix it.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: ae6d37 ("LU-17379 lnet: parallelize peer discovery via LNetAddPeer")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I8b9df330200ff2732efd2a54d8de910463993fae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54971
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17788 ptlrpc: restore watchdog revival message 42/54942/12
Andreas Dilger [Sat, 27 Apr 2024 02:48:15 +0000 (20:48 -0600)]
LU-17788 ptlrpc: restore watchdog revival message

Restore the "Service thread pid NNN completed after SSS.mmm
seconds.  This likely indicates the system was overloaded"
message that was lost during ptlrpc watchdog restructuring.

Do not rate limit this message, so that it is possible to see
when all threads are restored, even if their corresponding
"Service thread pid NNN was inactive" message was throttled.

Update recovery-small test_10a to check for these messages,
so that they are not removed again in the future.

Test-Parameters: testlist=recovery-small env=ONLY=10a
Test-Parameters: testlist=recovery-small env=ONLY=10a
Test-Parameters: testlist=recovery-small env=ONLY=10a
Test-Parameters: testlist=recovery-small env=ONLY=10a
Test-Parameters: testlist=recovery-small env=ONLY=10a
Fixes: fc9de679a4 ("LU-9859 libcfs: add watchdog for ptlrpc service threads.")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c7e96fb7f73ca5562a6f5ad780a79ffc83ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54942
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
18 hours ago LU-17786 tests: use $TSTUSR instead of hard coding quota_usr 40/54940/2
James Simmons [Fri, 26 Apr 2024 22:26:46 +0000 (18:26 -0400)]
LU-17786 tests: use $TSTUSR instead of hard coding quota_usr

The bash function check_system_is_clean() hard codes the user.
On many external systems we can't create special users for security
reasons, so use $TSTUSR instead, which can already exist for us.

Change-Id: I80d522f04bc813cd6d5aef000eeeb34d6ec81ebd
Fixes: 7e1fb1a296e ("LU-17179 tests: check the system is clean")
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54940
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17504 build: fix lock_handle array-index-out-of-bounds 26/54926/5
Andreas Dilger [Sat, 27 Apr 2024 01:13:52 +0000 (18:13 -0700)]
LU-17504 build: fix lock_handle array-index-out-of-bounds

After Linux kernel patch "ubsan: Tighten UBSAN_BOUNDS on GCC"
(commit v6.4-rc2-1-g2d47c6956ab3), flexible trailing arrays
declared like 'lock_handle[2]' will generate warnings when
CONFIG_UBSAN & co. is enabled:

    UBSAN: array-index-out-of-bounds in ldlm_request.c:1282:18
    index 2 is out of range for type 'lustre_handle [2]'

The declaration lock_handle[LDLM_LOCKREQ_HANDLES] confuses the
compiler into thinking there are only two fields in lock_handle,
but the caller often allocates extra fields beyond this for more
locks to be cancelled due to Early Lock Cancellation or from LRU.

Rather than have a second flexible array after lustre_handle[2],
declare the whole array as flexible, and fix up the few sites
that are allocating this array to ensure LDLM_LOCKREQ_HANDLES
fields are allocated at a minimum.
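A minimal sketch of the approach described above: the whole handle array is declared flexible, and the allocation helper enforces the minimum count. The names `lock_req`, `handle`, `LOCKREQ_MIN_HANDLES`, `lock_req_size` and `lock_req_alloc` are hypothetical stand-ins, not the real struct ldlm_request layout or the patched call sites.

```c
#include <stdint.h>
#include <stdlib.h>

#define LOCKREQ_MIN_HANDLES 2   /* stands in for LDLM_LOCKREQ_HANDLES */

struct handle {
    uint64_t cookie;
};

struct lock_req {
    uint32_t flags;
    uint32_t count;             /* number of handles allocated */
    struct handle handles[];    /* the whole array is flexible */
};

/* Size of a request holding at least the minimum number of handles. */
static size_t lock_req_size(uint32_t count)
{
    if (count < LOCKREQ_MIN_HANDLES)
        count = LOCKREQ_MIN_HANDLES;
    return sizeof(struct lock_req) + (size_t)count * sizeof(struct handle);
}

/* Allocate a zeroed request; never fewer than the minimum handles. */
static struct lock_req *lock_req_alloc(uint32_t count)
{
    struct lock_req *req;

    if (count < LOCKREQ_MIN_HANDLES)
        count = LOCKREQ_MIN_HANDLES;
    req = calloc(1, lock_req_size(count));
    if (req != NULL)
        req->count = count;
    return req;
}
```

With no fixed bound in the declaration, a bounds checker has no `[2]` limit to enforce, while the helper still guarantees that the two base handles exist.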

This subtly changes the checks in wiretest.c due to the removal
of the 2 "base" handles in ldlm_request, but I believe this is not
changing the wire protocol because it still allocates those handles
directly, and I have verified interoperability with a 2.14.0 server.

Test-Parameters: testlist=runtests clientversion=2.14
Test-Parameters: testlist=runtests serverversion=2.14
Test-Parameters: testlist=runtests clientversion=2.15
Test-Parameters: testlist=runtests serverversion=2.15
Test-Parameters: testlist=runtests clientversion=EXA5
Test-Parameters: testlist=runtests serverversion=EXA5
Test-Parameters: testlist=runtests clientversion=EXA6
Test-Parameters: testlist=runtests serverversion=EXA6
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9695fb44f1b5c84bb750d2983cdd8b939e3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17784 build: improve wiretest for flexible arrays 29/54929/2
Shaun Tancheff [Fri, 26 Apr 2024 11:24:34 +0000 (18:24 +0700)]
LU-17784 build: improve wiretest for flexible arrays

Flexible array checking can additionally probe that the size
of the array element is correct.
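One way such an element-size probe could look, as a sketch using C11 compile-time assertions. The struct and the macro `FLEX_ELEM_SIZE` are hypothetical and do not reproduce the actual wiretest macros.

```c
#include <stddef.h>
#include <stdint.h>

struct wire_handle {
    uint64_t cookie;
};

struct wire_req {
    uint32_t flags;
    uint32_t count;
    struct wire_handle handles[];   /* flexible array member */
};

/* Size of one element of a flexible array member. sizeof does not
 * evaluate its operand, so the null pointer is never dereferenced. */
#define FLEX_ELEM_SIZE(type, member) sizeof(((type *)0)->member[0])

/* The array must start where the wire format expects it ... */
_Static_assert(offsetof(struct wire_req, handles) == 8,
               "handles[] offset changed");
/* ... and each element must keep its on-wire size. */
_Static_assert(FLEX_ELEM_SIZE(struct wire_req, handles) == 8,
               "handles[] element size changed");
```

Since a flexible array contributes nothing to `sizeof(struct wire_req)`, checking the element size separately catches changes that an offset check alone would miss.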

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ib7de3d156a2e77dfaf2e9ab1df8fab524c073610
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17741 gss: fix lsvcgss service for systemd 15/54915/3
Sebastien Buisson [Thu, 25 Apr 2024 16:42:44 +0000 (18:42 +0200)]
LU-17741 gss: fix lsvcgss service for systemd

Add a systemd unit file for lsvcgss service, so that the lsvcgssd
daemon can be handled correctly via systemctl.

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5 clientdistro=el9.3 serverdistro=el9.3
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7581996e1e28567415da0827681841ac228ad6c5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17774 build: pass systemdsystemunitdir to "make debs" 02/54902/3
Jian Yu [Fri, 26 Apr 2024 17:10:03 +0000 (10:10 -0700)]
LU-17774 build: pass systemdsystemunitdir to "make debs"

This patch passes "--with-systemdsystemunitdir" configure
option to the configure command performed in "make debs".
It also updates debian/lustre-{client,server}-utils.install
with the detected/specified directory for systemd service files.

Test-Parameters: trivial clientdistro=ubuntu2204

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I7c36904ea0ed0f393a76b0fb0ad444b330dfa78c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54902
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17767 build: struct lsmcontext has slot or id member 81/54881/3
Sebastien Buisson [Tue, 23 Apr 2024 17:48:32 +0000 (10:48 -0700)]
LU-17767 build: struct lsmcontext has slot or id member

With Ubuntu 24.04 kernel 6.8.0-31-generic, the struct lsmcontext uses
a field named 'id' to identify the LSM module, instead of 'slot' in
previous kernel versions.

Fixes: 0e66489401 ("LU-16619 build: Ubuntu jammy 5.19 client support")
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5080e60614b42ed63103f93cae1f481851742d0b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54881
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17769 tests: run_one() repeats subtests for set duration 69/54869/6
Charlie Olmstead [Mon, 22 Apr 2024 16:37:12 +0000 (10:37 -0600)]
LU-17769 tests: run_one() repeats subtests for set duration

Implement ONLY_MINUTES=M environment variable to allow test runners
to execute a subtest for at least M minutes. Each time the subtest
completes, the duration is checked to see if it has exceeded
ONLY_MINUTES, therefore the parameter represents a minimum number
of minutes to run rather than an exact duration.

If, for some reason, both ONLY_REPEAT and ONLY_MINUTES are set,
the ONLY_REPEAT value takes precedence.
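The loop semantics described above can be modelled in a few lines. The real run_one() belongs to the bash test framework; the C sketch below only illustrates the control flow, and every name in it (`run_repeated`, `fake_clock`, `fake_subtest`) is hypothetical. The clock is injected so the behaviour can be checked without waiting.

```c
typedef long (*clock_fn)(void);     /* returns the current time in seconds */
typedef int (*subtest_fn)(void);    /* runs one iteration of the subtest */

/* Simulated clock: each fake subtest run takes 90 seconds. */
static long fake_time;

static long fake_clock(void)
{
    return fake_time;
}

static int fake_subtest(void)
{
    fake_time += 90;
    return 0;
}

/* Run the subtest repeatedly and return how many times it ran.
 * A non-zero only_repeat takes precedence over only_minutes. */
static unsigned int run_repeated(subtest_fn subtest, clock_fn now,
                                 unsigned int only_repeat,
                                 unsigned int only_minutes)
{
    unsigned int runs = 0;
    long start = now();

    if (only_repeat > 0) {
        while (runs < only_repeat) {
            subtest();
            runs++;
        }
        return runs;
    }

    /* The elapsed time is checked only after a run completes, so
     * only_minutes is a minimum duration, not an exact one. */
    do {
        subtest();
        runs++;
    } while (only_minutes > 0 && now() - start < (long)only_minutes * 60);

    return runs;
}
```

With 90-second runs and a 5-minute minimum, the loop stops after the fourth run at 360 seconds, which is the first check at or past the 300-second mark.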

Test-Parameters: trivial testlist=sanity env=ONLY=73
Test-Parameters: testlist=sanity env=ONLY=73,ONLY_REPEAT=10
Test-Parameters: testlist=sanity env=ONLY=73,ONLY_MINUTES=5
Test-Parameters: testlist=sanity env=ONLY=73,ONLY_REPEAT=100,ONLY_MINUTES=10
Test-Parameters: testlist=sanity env=ONLY=73,ONLY_REPEAT=10,ONLY_MINUTES=10
Signed-off-by: Charlie Olmstead <charlie@whamcloud.com>
Change-Id: I4b454fd8582d2b875762ee15451150afb3117d15
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54869
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 hours ago LU-17000 misc: fix strscpy() Coverity warnings 65/54865/6
Arshad Hussain [Mon, 22 Apr 2024 09:25:50 +0000 (14:55 +0530)]
LU-17000 misc: fix strscpy() Coverity warnings

Fix warning reported for use of uninitialized variable

CoverityID: 425254 ("Uninitialized scalar variable")

Fix warning reported when changing call from strlcpy()
to strscpy()

CoverityID: 425253 ("Unsigned compared against 0")
CoverityID: 425262 ("Unsigned compared against 0")
Fixes: 7a0517fa2 ("LU-17592 build: kernel 6.8 removed strlcpy()")
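The "Unsigned compared against 0" class of warning typically arises because strscpy() reports truncation with a negative return value, whereas strlcpy() returned an unsigned length. The sketch below shows the signed-result pattern; `strscpy_sketch` is a userspace stand-in written for this example, and `copy_name` is a hypothetical caller, neither being the code touched by the patch.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Userspace stand-in for the kernel's strscpy(): always NUL-terminates
 * when size > 0 and returns the number of bytes copied, or -E2BIG if
 * size is 0 or the source had to be truncated. */
static ssize_t strscpy_sketch(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -E2BIG;
    len = strlen(src);
    if (len >= size) {
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -E2BIG;
    }
    memcpy(dst, src, len + 1);
    return (ssize_t)len;
}

/* Returns 0 on success or -E2BIG if the name did not fit. */
static int copy_name(char *dst, size_t size, const char *name)
{
    /* Keep the result in a signed type. Stored in a size_t, the
     * "rc < 0" test below could never be true. */
    ssize_t rc = strscpy_sketch(dst, name, size);

    return rc < 0 ? (int)rc : 0;
}
```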

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id3804c77a105e4776a0242db787dc1ca2528d9ca
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17761 tests: make sanity-compr sanity/sanityn return 0 55/54855/2
Jian Yu [Fri, 19 Apr 2024 18:54:04 +0000 (11:54 -0700)]
LU-17761 tests: make sanity-compr sanity/sanityn return 0

While running sanity-compr sanity/sanityn, if there was a
sub-subtest failure, the sanity/sanityn test_cleanup would
be incorrectly marked as FAIL.

We should leave it to the individual sanity/sanityn subtests
to mark their failures; test_sanity() and test_sanityn()
should not also return an error.

Change-Id: I1fd645b80b92e583f1a564f85e6d2d6d871b8fa8
Test-Parameters: trivial testlist=sanity-compr
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54855
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 hours ago LU-14391 lnet: optimize the Netlink packet size for routes 44/54844/12
James Simmons [Fri, 26 Apr 2024 17:15:02 +0000 (13:15 -0400)]
LU-14391 lnet: optimize the Netlink packet size for routes

Currently Netlink by default sets its maximum packet size
to send back to user land to 64K. Some sites set up many
routes, above ~430, which exceeds this limit. We can avoid
this limitation by calculating the actual size of
the netlink packet and setting cb->min_dump_alloc. The
new max is then 4GB which should be plenty (27K of routes).

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ica01f0cf290992a5d27b8ac2d09508d0a6e8151a
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54844
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-17455 scripts: add IPv6 support to ksocklnd-config 33/54833/4
Serguei Smirnov [Wed, 17 Apr 2024 21:15:22 +0000 (14:15 -0700)]
LU-17455 scripts: add IPv6 support to ksocklnd-config

Expand ksocklnd-config script to support IPv6.
For every interface listed as the argument, check if IPv6
address is configured and set up routing accordingly.
The change replicates existing behavior for IPv4:
   - if existing route is found for the interface,
     or skip_mr_routing is enabled, the script skips
     adding a new route and prints a warning
   - if default gateway is found on the same subnet,
     a source-based rule and route are added for the
     IP/interface using the gateway
   - if default gateway is not found, a source-based rule
     and a local route are added for the IP/interface

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I69e249f2858a201f1b108afa05cce9fdf4ee8c80
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54833
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 hours ago LU-14535 utils: fix FORWARD_NULL issue from Coverity 27/54827/4
Hongchao Zhang [Sun, 14 Apr 2024 23:13:57 +0000 (07:13 +0800)]
LU-14535 utils: fix FORWARD_NULL issue from Coverity

Fix the possible NULL pointer issue reported by Coverity:

   case 'e':
CID 424708:    (FORWARD_NULL)
Passing null pointer "optarg" to "strtoul", which dereferences it.
      end_qid = strtoul(optarg, NULL, 0);
      break;

CoverityID: 424708 ("FORWARD NULL")
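A defensive way to handle this class of report is to validate the argument before it reaches strtoul(). The commit message does not show how the patch itself guards optarg, so the helper below, with the hypothetical name `parse_qid`, is only a sketch of the general idea.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse an unsigned ID from an option argument. Returns 0 on success
 * or -EINVAL if the argument is missing, empty or not a number. */
static int parse_qid(const char *arg, unsigned long *qid)
{
    char *end;

    /* Guard first: strtoul() dereferences its first argument. */
    if (arg == NULL || *arg == '\0')
        return -EINVAL;

    errno = 0;
    *qid = strtoul(arg, &end, 0);
    if (errno != 0 || *end != '\0')
        return -EINVAL;

    return 0;
}
```

In a getopt loop this would be called as `parse_qid(optarg, &end_qid)`, with the caller printing a usage error on a non-zero return.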

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Idfb5cb4c6fe63ec08dd9048742f3f280b125eb8a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54827
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 hours ago LU-17625 statahead: avoid to use @sai after its has been freed 26/54826/3
Qian Yingjin [Wed, 17 Apr 2024 08:22:02 +0000 (04:22 -0400)]
LU-17625 statahead: avoid to use @sai after its has been freed

There is a race between a statahead thread starting up and another
statahead request trying to access the same statahead structure.
If the statahead thread fails to start, it frees the statahead
structure too early. The user stat() request then wrongly uses the
statahead structure after its memory has already been freed.

In this patch, we replace @ll_sai_free/@ll_sax_free with
@ll_sai_put/@ll_sax_put to avoid freeing the statahead structure
too early while it is still being used by a user stat()
request.
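The difference between freeing a shared structure outright and dropping a reference to it can be sketched as below. All names (`sa_ctx`, `sa_ctx_get`, `sa_ctx_put`, `live_objects`) are hypothetical, and the sketch is single-threaded; real kernel code would use atomic reference counts with appropriate locking.

```c
#include <stdlib.h>

/* Number of contexts currently allocated; lets a caller observe frees. */
static int live_objects;

struct sa_ctx {
    unsigned int refcount;
};

/* Allocate a context holding one reference for the creator. */
static struct sa_ctx *sa_ctx_alloc(void)
{
    struct sa_ctx *ctx = calloc(1, sizeof(*ctx));

    if (ctx != NULL) {
        ctx->refcount = 1;
        live_objects++;
    }
    return ctx;
}

/* Take an additional reference, e.g. for a concurrent stat() request. */
static void sa_ctx_get(struct sa_ctx *ctx)
{
    ctx->refcount++;
}

/* Drop one reference; memory is freed only when the last user is done. */
static void sa_ctx_put(struct sa_ctx *ctx)
{
    if (--ctx->refcount == 0) {
        free(ctx);
        live_objects--;
    }
}
```

If the startup error path calls the put helper instead of freeing directly, a request that still holds its own reference keeps the structure alive until it also calls put.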

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I3840be959160aed2887a91be81da05f796306cd9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54826
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours ago LU-17734 build: Debian: oblige --disable-tests if asked 64/54764/4
Ellis Wilson [Fri, 15 Oct 2021 20:23:25 +0000 (16:23 -0400)]
LU-17734 build: Debian: oblige --disable-tests if asked

Do not disable tests by default for debian-based builds, but permit
users to disable them if they choose by passing in --disable-tests.

Test-Parameters: trivial
Signed-off-by: Ellis Wilson <elliswilson@microsoft.com>
Change-Id: I90088e6e95fa9e46ae063dfc061a324293fde9a2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54764
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 hours ago LU-17714 gss: protect against revoked session keyring 06/54706/5
Sebastien Buisson [Mon, 8 Apr 2024 15:52:50 +0000 (17:52 +0200)]
LU-17714 gss: protect against revoked session keyring

In case the session keyring is revoked, request_key() still tries to
search it. Sadly this keyring is searched before the user keyring, so
it will return -EKEYREVOKED, and the user keyring, that does contain
the Lustre key, will not even be searched.
To work around this issue in the kernel implementation of request_key,
override the current process's credentials with no session keyring,
if we detect it has been revoked.

Test-Parameters: kerberos=true testlist=sanity-krb5 serverdistro=el8.9
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I64b6ac4693a47cf43d6fa1bf4e17bfb4907670fa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54706
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours ago LU-17714 gss: cleanup user keyring usage 92/54692/8
Sebastien Buisson [Mon, 8 Apr 2024 09:06:50 +0000 (11:06 +0200)]
LU-17714 gss: cleanup user keyring usage

User keys are linked to the user keyring. But we should not keep an
extra reference on the user keyring for every user key being created.
This leads to too many references on this keyring, and prevents proper
destruction in case the system wants to clean it up (because the user
logged off for instance).
And when unlinking a user key, we need to take care of the user
namespace, in order to fetch the real user keyring, and not the one
associated with the mapped uid in the user namespace.
Finally we must handle the case where the user key is explicitly
revoked via 'keyctl revoke' on the command line, by carrying out the
same cleanup as when 'lfs flushctx' is called. This properly drops
references on the key, and frees the security context associated with
the key.

Test-Parameters: kerberos=true testlist=sanity-krb5 serverdistro=el8.9
Fixes: 02b456e4a4 ("LU-17173 gss: user keys go to user keyring")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic168b68f8652689aa4402eaa4fcdbd852743d320
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54692
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Bruno Faccini <bfaccini@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours ago LU-17704 revert: "LU-17379 ptlrpc: fix check for callback discard" 86/54686/4
Andreas Dilger [Fri, 5 Apr 2024 22:42:48 +0000 (22:42 +0000)]
LU-17704 revert: "LU-17379 ptlrpc: fix check for callback discard"

This reverts commit a6886dba0ed8a622c9831cd33d310d933492c72d.
This is failing dbench intermittently in sanity-benchmark.

Change-Id: Id3720c79ca8dd9276e086aab5d3fcfe43ddd680a
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54686
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
19 hours ago LU-17657 build: gcc 13 stricter enum checking 68/54468/6
Shaun Tancheff [Fri, 26 Apr 2024 15:25:19 +0000 (22:25 +0700)]
LU-17657 build: gcc 13 stricter enum checking

gcc 13 does not allow mixing of enum and integer
types between function declaration and implementation.

Clean up a couple of instances where an enum is treated
as a uint32_t / __u32 and treat it as an enum type.
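The kind of mismatch in question is a prototype that takes an enum while the definition takes an integer type, or the reverse; GCC 13 reports this through its -Wenum-int-mismatch diagnostic. The sketch below shows the consistent form, with the hypothetical names `layout_kind` and `layout_kind_name` chosen for this example only.

```c
enum layout_kind {
    LAYOUT_PLAIN = 1,
    LAYOUT_COMPOSITE = 2,
};

/* The declaration uses the enum type ... */
const char *layout_kind_name(enum layout_kind kind);

/* ... and so does the definition. Declaring the parameter as an
 * integer type (for example uint32_t) in only one of the two places
 * is the inconsistency that the stricter check rejects. */
const char *layout_kind_name(enum layout_kind kind)
{
    switch (kind) {
    case LAYOUT_PLAIN:
        return "plain";
    case LAYOUT_COMPOSITE:
        return "composite";
    }
    return "unknown";
}
```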

lustre/lov/lov_ea.c: In function 'lsme_unpack_comp':
lustre/lov/lov_ea.c:531:21: error: array subscript
   'struct lov_stripe_md_entry[0]' is partly outside array bounds
    of 'struct lov_stripe_md_entry[0]' [-Werror=array-bounds=]
  531 |                 lsme->lsme_magic = magic;

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I8e2ef989ecbdebe5e13bcea0fbb210c4a14eb45e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54468
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 hours ago LU-17580 llite: Remove all referance of LOOKUP_CONTINUE 69/54169/7
Arshad Hussain [Sun, 25 Feb 2024 01:13:22 +0000 (06:43 +0530)]
LU-17580 llite: Remove all referance of LOOKUP_CONTINUE

In newer kernels (3.1 and beyond) the LOOKUP_CONTINUE flag is
replaced by (and is the same as) the LOOKUP_PARENT flag, so any
definitions of LOOKUP_CONTINUE can safely be removed.

Linux-commit: 49084c3bb2055c401f3493c13edae14d49128ca0
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I05eac0ec1321d230c7a215f95888d4040b7c670a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54169
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
19 hours ago LU-13791 mdt: allow using symbolic capability names 18/54118/3
Andreas Dilger [Wed, 21 Feb 2024 00:59:25 +0000 (17:59 -0700)]
LU-13791 mdt: allow using symbolic capability names

Allow the "mdt.*.enable_cap_mask" parameter to set and print symbolic names,
similar to the "debug" and "subsystem_debug" parameters.  The
allowed parameter names are in the capabilities(7) man page, in
either upper or lowercase, like cap_chown, cap_dac_read_search,
etc. along with "all" to enable all capabilities if clients are
trusted.  For example:

    lctl set_param -P mdt.lfs-*.enable_cap_mask=+cap_dac_read_search

Since kernel_cap_t is a 64-bit value, enhance cfs_str2mask() to
take u64 mask arguments.  The calling libcfs_debug_str2mask()
sticks with "int mask" for now.

Split the core out from libcfs_debug_mask2str() into a new helper
function cfs_mask2str() so it can be called directly.
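
As a rough user-space sketch of the idea, the following maps symbolic capability names onto a 64-bit mask. It is not the cfs_str2mask() implementation: sketch_str2mask() and its table are hypothetical, only four capabilities are listed (bit numbers as in the Linux capability header), and treating an unprefixed name like "+name" is a simplifying assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name table; bit numbers follow linux/capability.h. */
struct sketch_cap_name {
	const char *name;
	unsigned int bit;
};

static const struct sketch_cap_name sketch_caps[] = {
	{ "cap_chown", 0 },
	{ "cap_dac_override", 1 },
	{ "cap_dac_read_search", 2 },
	{ "cap_fowner", 3 },
};

#define SKETCH_NCAPS (sizeof(sketch_caps) / sizeof(sketch_caps[0]))

/* Case-insensitive string equality, so CAP_CHOWN matches cap_chown. */
static int sketch_name_eq(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;
}

/*
 * Apply one token such as "+cap_chown", "-CAP_FOWNER" or "all" to *mask.
 * Returns 0 on success, -1 if the name is not in the table.
 */
int sketch_str2mask(const char *token, uint64_t *mask)
{
	uint64_t bits = 0;
	int clear = 0;
	size_t i;

	assert(token != NULL && mask != NULL);

	if (*token == '+' || *token == '-') {
		clear = (*token == '-');
		token++;
	}

	for (i = 0; i < SKETCH_NCAPS; i++) {
		if (sketch_name_eq(token, "all") ||
		    sketch_name_eq(token, sketch_caps[i].name))
			bits |= UINT64_C(1) << sketch_caps[i].bit;
	}
	if (bits == 0)
		return -1;

	*mask = clear ? (*mask & ~bits) : (*mask | bits);
	return 0;
}
```

A 64-bit mask type is used throughout because, as noted above, kernel_cap_t does not fit in an int.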

Fixes: 54f677651b ("LU-13791 mdt: parameter to tune capabilities")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3f71f61a17d4d3614e46a526c60e709d9eb825b3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54118
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-17523 ldiskfs: sync series to include el8.4 92/53992/7
Shaun Tancheff [Tue, 5 Mar 2024 02:23:33 +0000 (09:23 +0700)]
LU-17523 ldiskfs: sync series to include el8.4

el8.4, .5 and .6 include:
  rhel8/ext4-deep-tree.patch
  rhel7.6/ext4-dquot-commit-speedup.patch
  rhel8/ext4-ext-merge.patch
  rhel8/ext4-mballoc-dense.patch

el8.6 includes:
  rhel8/ext4-race-in-ext4-destroy-inode.patch
  rhel8/ext4-mballoc-dense.patch

el8.7 includes:
  rhel8/ext4-deep-tree.patch
  rhel8/ext4-race-in-ext4-destroy-inode.patch
  rhel8/ext4-mballoc-dense.patch

el8.8 and .9 include:
  rhel8/ext4-limit-per-inode-preallocation-list.patch

el8.9 includes:
  rhel8/ext4-race-in-ext4-destroy-inode.patch
  rhel8/ext4-mballoc-dense.patch

Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.9 serverdistro=el8.9 testlist=sanity
Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.8 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional fstype=ldiskfs clientdistro=el8.8 serverdistro=el8.7 testlist=sanity
Test-Parameters: optional fstype=ldiskfs clientdistro=el8.8 serverdistro=el8.6 testlist=sanity
Test-Parameters: optional fstype=ldiskfs clientdistro=el8.8 serverdistro=el8.5 testlist=sanity
Test-Parameters: optional fstype=ldiskfs clientdistro=el8.8 serverdistro=el8.4 testlist=sanity
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I2f5515947a16dff7f2502ec281675f56b2470ea7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53992
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-17483 gss: refresh req context with already existing one 59/53859/8
Sebastien Buisson [Tue, 30 Jan 2024 12:13:52 +0000 (13:13 +0100)]
LU-17483 gss: refresh req context with already existing one

When we are processing a request with a root GSS context that
has the PTLRPC_CTX_ERROR_BIT bit set, try to replace it with an
already existing context. Such a context can already be up-to-date
thanks to other authentication requests sent to failover NIDs while
the current request was in the delay list. This valid context can be
fetched from the struct ptlrpc_sec.

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iff1cf727c4579cba6456e010aac6537cf888b0ae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53859
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-12885 mds: add enums for MDS_OPEN flags 69/36469/33
Andreas Dilger [Tue, 9 Apr 2024 08:22:07 +0000 (04:22 -0400)]
LU-12885 mds: add enums for MDS_OPEN flags

This patch is the first in a series of patches that separate
kernel open flags from MDS open flags.

The first step is to add enum mds_open_flags to the code to
make it easier to follow the logic. Rename it_flags to
it_open_flags and use enum mds_open_flags in the code so it
is clear that MDS_OPEN flags are being used.
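
To illustrate why a distinct enum type helps, here is a small user-space sketch that converts kernel-style open(2) flags into a separate flag enum. The enum, its values, and the function name are hypothetical stand-ins, not the real mds_open_flags definitions or any code from this series.

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical flag values, for illustration only. */
enum sketch_mds_open_flags {
	SKETCH_MDS_FMODE_READ  = 0x1,
	SKETCH_MDS_FMODE_WRITE = 0x2,
	SKETCH_MDS_OPEN_CREAT  = 0x4,
	SKETCH_MDS_OPEN_EXCL   = 0x8,
};

/*
 * Translate kernel open flags into the sketch enum.  Because the
 * return type is an enum, callers cannot confuse the result with
 * the raw O_* flags they passed in.
 */
enum sketch_mds_open_flags sketch_kernel_to_mds_flags(int kflags)
{
	int acc = kflags & O_ACCMODE;
	unsigned int mflags = 0;

	/* This sketch only handles the three standard access modes. */
	assert(acc == O_RDONLY || acc == O_WRONLY || acc == O_RDWR);

	if (acc == O_RDONLY || acc == O_RDWR)
		mflags |= SKETCH_MDS_FMODE_READ;
	if (acc == O_WRONLY || acc == O_RDWR)
		mflags |= SKETCH_MDS_FMODE_WRITE;
	if (kflags & O_CREAT)
		mflags |= SKETCH_MDS_OPEN_CREAT;
	if (kflags & O_EXCL)
		mflags |= SKETCH_MDS_OPEN_EXCL;

	return (enum sketch_mds_open_flags)mflags;
}
```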

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I933a6e6102f947a9276cb6bf03826fd4a53ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/36469
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
19 hours agoLU-11085 ldlm: save space in struct ldlm_lock 31/53931/9
Mr NeilBrown [Mon, 5 Feb 2024 22:46:49 +0000 (09:46 +1100)]
LU-11085 ldlm: save space in struct ldlm_lock

Moving the 'interval' handle into ldlm_lock has made the structure
bigger.  Compensate for this by sharing space between fields that are
only needed for specific lock types.

i.e.  some fields are only needed for EXTENT locks, some for FLOCK
locks, some for PLAIN and IBITS which use "skiplists".
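
The space-sharing technique can be sketched with a union, as below. The structures are hypothetical and much smaller than the real struct ldlm_lock; the field names and layouts are assumptions made only to show how per-type fields can overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_lock_type {
	SKETCH_PLAIN,
	SKETCH_EXTENT,
	SKETCH_FLOCK,
	SKETCH_IBITS,
};

/* Hypothetical per-type field groups. */
struct sketch_extent_part {
	uint64_t start;
	uint64_t end;
	void *rb_links[3];
};

struct sketch_flock_part {
	uint64_t owner;
	void *rb_links[3];
};

struct sketch_skiplist_part {
	void *mode_links[2];
	void *policy_links[2];
};

/* Every lock pays for every group, whatever its type. */
struct sketch_lock_separate {
	enum sketch_lock_type type;
	struct sketch_extent_part extent;
	struct sketch_flock_part flock;
	struct sketch_skiplist_part skip;
};

/* Only one group is live per lock, so the groups can overlap. */
struct sketch_lock_shared {
	enum sketch_lock_type type;
	union {
		struct sketch_extent_part extent;
		struct sketch_flock_part flock;
		struct sketch_skiplist_part skip;
	} u;
};

/* Accessor that checks the lock type before exposing extent fields. */
struct sketch_extent_part *sketch_lock_extent(struct sketch_lock_shared *lock)
{
	assert(lock != NULL && lock->type == SKETCH_EXTENT);
	return &lock->u.extent;
}
```

The shared layout is only as large as its biggest group, at the cost of having to check the lock type before touching a group.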

On x86_64 this reduces the size of ldlm_lock to what it was before the
previous patch.  A future patch will reduce it even more.

As extent and flock both used the interval tree node, they now have
different instances.  So the names in flock are changed.  Both of
these will disappear in future patches.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iec92a41c174e4884852ebf8fbb2cd50d4e165035
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53931
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
19 hours agoLU-11085 ldlm: simplify use of interval-tree. 21/33221/28
NeilBrown [Wed, 25 Mar 2020 02:50:16 +0000 (22:50 -0400)]
LU-11085 ldlm: simplify use of interval-tree.

The interval tree used for keeping track of extent locks is currently
separate from those locks themselves.  A separate 'ldlm_interval'
structure is allocated and linked to all locks which have the same
extent.

This requires that the interval tree library handles an insert where
exactly the same interval already exists differently from any other
insert.  No other user of the interval tree library wants this, and
the library which is part of Linux doesn't support it.  So it would be
good to remove this requirement.

This patch changes the library, removes the 'ldlm_interval' structure,
and stores each lock in the tree.  This substantially simplifies a lot
of code, but has some costs.

The ldlm_lock is now larger - it contains three pointers for the
rbtree where previously it had one, and it now has an extra copy of
the range start/end.  These will be resolved in later patches by
removing duplication and sharing space with other fields that aren't
used for extent locks.

The extent-tree can now be substantially larger as it now contains
every lock for a given extent rather than each extent only once.  As
the depth of the tree grows with the log of the number of elements,
this isn't an enormous cost, but it may still be measurable.  In
particular, locks that cover the full extent [0..MAX] are common and
can swamp other locks (citation needed).  Such locks can be easily
kept in a separate list.  This will restore some of the code
complexity, but is otherwise of little cost.

Linux-commit: 71236833ad7a98b69e6e675efefbdc04a74c1d4b

Change-Id: I6c82d971aabd02bb036ac0bd27a934d48e972895
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/33221
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
19 hours agoLU-14810 lnet: ongoing push when discovery is stopped 84/54884/3
Cyril Bordage [Wed, 24 Apr 2024 02:21:53 +0000 (04:21 +0200)]
LU-14810 lnet: ongoing push when discovery is stopped

If a push is not completed when the discovery thread is stopped, then
we still have ln_dc_handler in use as the MD handler (from
lnet_peer_send_push). That leads to an assertion failure in
lnet_assert_handler_unused.

To fix that, we call lnet_assert_handler_unused only after the monitor
thread has been stopped. Thus, the patch for LU-17496 is not needed
anymore.

Fixes: 36b14a23a6 ("LU-17207 lnet: race b/w monitor thr stop and discovery push")
Test-Parameters: testlist=sanity-lnet env=ONLY="212 220",ONLY_REPEAT=100
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I426c37b12a3d29327a7295f528a5b875a9ac88a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54884
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-17745 llite: fix the umount panic due to BDI unregister 50/54850/4
Qian Yingjin [Fri, 19 Apr 2024 02:53:10 +0000 (22:53 -0400)]
LU-17745 llite: fix the umount panic due to BDI unregister

There is a regression in the patch for LU-16954 on the old RHEL
kernel (RHEL8.2). When Lustre is unmounted, the client crashes.

In LU-16954, to avoid the remount failure, we explicitly
unregister the sysfs for the @bdi on new kernels such as the Ubuntu
22.04 v5.15 kernel.
However, this is not needed for old kernels such as RHEL 8.2.
In this patch, we remove the explicit unregister for the old kernel
to avoid the client crash during unmount.

Fixes: dcc1dd39a6 ("LU-16954 llite: add SB_I_CGROUPWB on super block for cgroup")
Test-Parameters: clientdistro=ubuntu2204 testlist=sanity-sec
Test-Parameters: clientdistro=el8.9 testlist=sanity-sec
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ic6df572744bed8994c08fb1369cc9beccbe2d87a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54850
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-6142 osd-zfs: Fix style issues for osd_io.c 64/54264/5
Arshad Hussain [Mon, 4 Mar 2024 07:45:23 +0000 (02:45 -0500)]
LU-6142 osd-zfs: Fix style issues for osd_io.c

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_io.c

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ia9153be34a1d583195e3ecfc56ca4ab279781566
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-17743 ko2iblnd: move to struct lnet_nid 71/54771/6
James Simmons [Thu, 25 Apr 2024 23:00:24 +0000 (19:00 -0400)]
LU-17743 ko2iblnd: move to struct lnet_nid

Move all non-wire data structures using lnet_nid_t to
struct lnet_nid. This is the first step to support
IPv6 / GUID.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I9d1281a1b7ab7bda566369be2bc5f07ba3ce17f9
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54771
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 hours agoLU-13814 osc: Remove osc delete for transient pages 79/52079/19
Patrick Farrell [Fri, 23 Feb 2024 16:16:42 +0000 (11:16 -0500)]
LU-13814 osc: Remove osc delete for transient pages

Transient pages do not need an extra reference for being
part of a transfer, because they are referenced throughout
by cl_io.  This requires a tweak to the page completion
behavior.

This allows us to remove osc_page_delete for transient
pages.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I96539731f972b19830b2e08bf0f1d1f1e9674241
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52079
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
19 hours agoLU-13814 osc: specialize osc_page_delete 78/52078/20
Patrick Farrell [Fri, 23 Feb 2024 16:05:35 +0000 (11:05 -0500)]
LU-13814 osc: specialize osc_page_delete

Nearly all of osc_page_delete is only done for cacheable pages,
so make that explicit.  osc_lru_del() doesn't do anything because
transient pages can't go in the LRU.  In osc_teardown_async_page(),
the latter side of the if statement is a search in cache, so it
never finds the page, then the earlier part is a check that the
page isn't in an RPC.  That's not really possible for DIO pages
unless something is *really* off.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I998fc196c276aa97829f5b368e23aa4b7a797294
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52078
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
19 hours agoLU-17524 llite: DIO and writev and readv syscalls 96/53996/19
Shaun Tancheff [Wed, 24 Apr 2024 22:24:44 +0000 (18:24 -0400)]
LU-17524 llite: DIO and writev and readv syscalls

Linux kernel v3.15-rc4-329-g62a8067a7f35
  bio_vec-backed iov_iter
Introduced iov_iter_get_pages_alloc

In kernels prior to iov_iter_get_pages_alloc the family
of iovec iter syscalls such as readv and writev fail to
iterate over the iovec segments.

In this case the iter() handler should submit the iovec
while looping over the segments.
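
The per-segment submission described above might look roughly like this in user space. It is a sketch under stated assumptions, not the llite handler: sketch_submit_iovec(), the callback type, and sketch_count_bytes() are hypothetical, and only struct iovec from <sys/uio.h> is a real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical per-segment submit callback: returns bytes done or < 0. */
typedef long (*sketch_submit_fn)(void *base, size_t len, void *ctx);

/*
 * Walk the iovec array and submit each segment in turn, instead of
 * assuming the whole request is one contiguous buffer.
 */
long sketch_submit_iovec(const struct iovec *iov, int nr_segs,
			 sketch_submit_fn submit, void *ctx)
{
	long total = 0;
	int i;

	assert(iov != NULL && submit != NULL);

	for (i = 0; i < nr_segs; i++) {
		long rc;

		if (iov[i].iov_len == 0)
			continue;

		rc = submit(iov[i].iov_base, iov[i].iov_len, ctx);
		if (rc < 0)
			return total ? total : rc; /* report partial progress */

		total += rc;
		if ((size_t)rc < iov[i].iov_len)
			break; /* short transfer: stop here */
	}
	return total;
}

/* Example callback: counts the bytes it is handed via *ctx. */
long sketch_count_bytes(void *base, size_t len, void *ctx)
{
	(void)base;
	*(size_t *)ctx += len;
	return (long)len;
}
```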

Linux kernel v5.19-10287-gfcb14cb1bdac
  new iov_iter flavour - ITER_UBUF

This introduced user_backed_iter(); provide a user_backed_iter()
for older kernels.

Fixes: 0006eb3644 ("LU-16328 llite: migrate_folio, vfs_setxattr")
Fixes: 044503492c ("LU-6260 llite: add support for new iter functionality")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Idec6a956918a1744f2801ffce9b40acb2c074523
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53996
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
19 hours agoLU-16822 tests: Update sanity-lnet router tests for IPv6 28/53728/6
Chris Horn [Thu, 25 Apr 2024 17:36:25 +0000 (13:36 -0400)]
LU-16822 tests: Update sanity-lnet router tests for IPv6

Modify sanity-lnet test cases that test routing to work with IPv6
NIDs.

test_100/102/105/106:
  - Modified to use setup_router_test() to create a real router and
    use the associated LNet configuration in their tests.
test_101/103:
  - These test cases exercise the NID range functionality. They are
    skipped under IPv6 config

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I47b23e9c63d74d937cae7c7b8b1b27dd383fc0dc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53728
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 weeks agoNew tag 2.15.63 2.15.63 v2_15_63
Oleg Drokin [Thu, 2 May 2024 05:05:18 +0000 (01:05 -0400)]
New tag 2.15.63

Change-Id: I2ceb1e0afe9bd966555579b5d70bd263016884e2
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17504 build: fix gcc-13 [-Werror=stringop-overread] error 34/54834/6
Shaun Tancheff [Thu, 25 Apr 2024 17:57:36 +0000 (00:57 +0700)]
LU-17504 build: fix gcc-13 [-Werror=stringop-overread] error

This patch fixes the following [-Werror=stringop-overread] and
[-Werror=attribute-warning] errors detected by gcc 13:

lustre/mgc/mgc_request.c:190:21: error: 'strcmp' reading 1 or
more bytes from a region of size 0 [-Werror=stringop-overread]
  190 | if (strcmp(logname, cld->cld_logname) == 0) {
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In function 'fortify_memcpy_chk',
    inlined from 'class_handle_ioctl' at
/root/lustre-release/lustre/obdclass/class_obd.c:381:3:
include/linux/fortify-string.h:528:25: error:
call to '__write_overflow_field' declared with attribute warning:
detected write beyond size of field (1st parameter);
maybe use struct_group()? [-Werror=attribute-warning]
  528 |  __write_overflow_field(p_size_field, size);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I59f5a88b4cd64c9f4e67e568546baada371543b1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54834
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-17587 build: use kernel version from dkms for client 30/54830/2
snehring [Wed, 17 Apr 2024 16:09:17 +0000 (11:09 -0500)]
LU-17587 build: use kernel version from dkms for client

The current behavior of the dkms build for clients is to only build
for the running kernel. This is fine if the other kernels are ABI
compatible with the running kernel because we tell dkms to run
weak-updates as part of the install process. However, if kernels that
are not ABI compatible with the running kernel are installed they
won't be targeted and weak-updates won't add in the modules. This
could be worked around by running 'dkms install' once booted into the
new kernel, but that's additional administrator overhead and not the
assumed behavior for a dkms module.

This modifies the dkms build script to accept the kernel version from
dkms and configure for that version. It also changes the behavior of
dkms wrt lustre to disable weak module updates since we're now
building for individual kernel versions. This will likely result in
longer times to install the client since we're building for each
installed version of the kernel, but it _should_ mean the client is
actually installed for each version.

Signed-off-by: snehring <snehring@iastate.edu>
Change-Id: I55fb1bb7159772d7ecd9d1837e870c7097c02d78
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17736 tests: Fix sanityn/73 for test machines with auditd 09/54809/3
Ellis Wilson [Fri, 8 Oct 2021 14:27:39 +0000 (10:27 -0400)]
LU-17736 tests: Fix sanityn/73 for test machines with auditd

getfattr performs one stat followed by two getxattr syscalls against
the provided file.  Normally, the stat results in no getxattr calls
internally (as it's not something stat is required to return).

However, if auditd is enabled AND one of the rules includes a
filesystem-specific rule such as watch directory X and record if it's
modified, then for every lookup (each of the three syscalls includes
one) an additional getxattr will be performed, resulting in 5 total
getxattrs.

Because there is significant fuzz here, revise the check to be
at minimum the two "expected" getxattrs but allow for more.
Comments have been added explaining this.

Signed-off-by: Ellis Wilson <elliswilson@microsoft.com>
Test-Parameters: trivial testlist=sanityn env=ONLY=73,ONLY_REPEAT=10
Change-Id: I0da5c2a5331f7dba4e65051a073e2bec05327a25
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54809
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-17362 build: Update ZFS version to 2.1.15 69/54769/2
Jian Yu [Fri, 12 Apr 2024 15:46:44 +0000 (08:46 -0700)]
LU-17362 build: Update ZFS version to 2.1.15

Update ZFS version to 2.1.15. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.1.15

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 serverdistro=el9.3 testlist=sanity

Test-Parameters: optional fstype=zfs testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs testgroup=full-dne-zfs-part-3

Change-Id: I51532dbf9dbcadf64bb9dbd3b10e88d0cab38ffd
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54769
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17727 tests: add to auster --stop-on-error option 55/54755/3
Xiaolin (Charlene) Zang [Thu, 8 Jul 2021 04:32:40 +0000 (00:32 -0400)]
LU-17727 tests: add to auster --stop-on-error option

Add a --stop-on-error option to auster, which takes a comma-separated
list of tests.

If any such test fails, auster will exit immediately without any
cleanup to make debugging particularly difficult and rare bugs more
tractable.

Signed-off-by: Xiaolin (Charlene) Zang <xiaolinzang@microsoft.com>
Change-Id: Icd8d1eaf8ae799bd74f9147ac9080a0950977526
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54755
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Charlie Olmstead <charlie@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17497 tests: skip sanity-sec/69 for old MDS 82/54782/2
Andreas Dilger [Sun, 14 Apr 2024 07:43:08 +0000 (01:43 -0600)]
LU-17497 tests: skip sanity-sec/69 for old MDS

Older MDS versions do not have strict checking for identity_upcall
or rsi_upcall, so don't run the test with those servers.

Test-Parameters: trivial testlist=sanity-sec env=ONLY=69 serverversion=2.15
Fixes: 2153e86541 ("LU-17497 obdclass: check upcall incorrect values")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icdfda82eca32c2de7e88991ead0d9723023ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54782
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-16741 lvm: rename ptlrpc_req_finished for component lvm 93/54693/2
Arshad Hussain [Mon, 8 Apr 2024 10:51:37 +0000 (06:51 -0400)]
LU-16741 lvm: rename ptlrpc_req_finished for component lvm

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
lvm component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I58dd90e4ae1a8834866491bf866cbacbd1c6e609
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54693
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17706 lnet: reserve TOFULND and EFALND 74/54674/2
Andreas Dilger [Thu, 4 Apr 2024 18:42:02 +0000 (12:42 -0600)]
LU-17706 lnet: reserve TOFULND and EFALND

Reserve network numbers for Fujitsu Torus Fusion LND and Amazon
Elastic Fabric Adapter LND to avoid hard-to-fix conflicts in the
future.

Add comments for the other LND numbers to provide some context.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icea6cecf5a951c5a44527c937a2631c9cc3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17703 lod: check the inherited pool for conflicts 61/54661/5
Vitaly Fertman [Wed, 3 Apr 2024 20:33:20 +0000 (23:33 +0300)]
LU-17703 lod: check the inherited pool for conflicts

In addition to LU-15658, the start index could be inherited from
parent and the pool from root: drop the pool in case of conflict
as well.

Another case of problematic inheritance is saving the inherited LOVEA
to a subdir when all the parameters are inherited except the OST list.

HPE-bug-id: LUS-11330, LUS-11631
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: Ief1dbd8c1ee0433bb625cbff1834b248d4fb2992
Reviewed-on: https://es-gerrit.hpc.amslabs.hpecorp.net/161800
Tested-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54661
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17669 test: using unintialized variable in sanity:160n 49/54549/2
Li Xi [Mon, 25 Mar 2024 02:20:35 +0000 (10:20 +0800)]
LU-17669 test: using unintialized variable in sanity:160n

This patch fixes a simple typo that caused an uninitialized variable to be used.

Fixes: d813c75df ("LU-14688 mdt: changelog purge deletes plain llog")

Test-Parameters: trivial testlist=sanity env=ONLY=160n
Change-Id: I2e29cce33733c925dfe9a53c06af7ac17b2c6be3
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54549
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-930 ptlrpc: quiet idle import logging 40/54540/2
Andreas Dilger [Fri, 22 Mar 2024 23:20:38 +0000 (16:20 -0700)]
LU-930 ptlrpc: quiet idle import logging

Don't log a debug message for every idle import every 25s, as this
pushes out other more important messages from the logs.

Fixes: 5a6ceb664f ("LU-7236 ptlrpc: idle connections can disconnect")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id98c2acad07cec62af0d705a437a4d2915ce9f62
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54540
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17431 utils: add 'dynamic' parameter to nodemap_cmd 03/54503/10
Sebastien Buisson [Wed, 20 Mar 2024 08:05:41 +0000 (09:05 +0100)]
LU-17431 utils: add 'dynamic' parameter to nodemap_cmd

Adding a 'dynamic' parameter to nodemap_cmd() will enable
'lctl nodemap_*' commands to handle dynamic nodemaps, i.e.
nodemaps created directly on MDS/OSS side, and stored in memory.

If both MDT and OST are running on the same node, the MDS device
is used for the ioctl.  It doesn't matter which one is actually
used, since it gets to the same place in ptlrpc anyway, it just
needs to find a valid OBD device to run the ioctl.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id58199e1ad6622aad896737604c0a8e1287ba34e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54503
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-17431 nodemap: add function to know if nodemap is on MGS 06/54506/8
Sebastien Buisson [Wed, 20 Mar 2024 08:33:11 +0000 (09:33 +0100)]
LU-17431 nodemap: add function to know if nodemap is on MGS

Adding the nodemap_mgs() function makes it possible to know whether
nodemaps are defined on an MGS node (pointer to a nodemap config file)
or not.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id87e34dd8d13cd21c88c87ef9e8e91ff9ff142c8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17627 build: fix new mofed version 36/54336/5
Minh Diep [Wed, 6 Mar 2024 02:26:58 +0000 (18:26 -0800)]
LU-17627 build: fix new mofed version

Allow multi-digit MOFED version numbers.
Fix compare_version function to return what it should
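
Why multi-digit components matter can be shown with a small numeric comparison. This C sketch is not the build system's compare_version (which lives in the configure scripts); sketch_compare_version() is a hypothetical helper that assumes versions start with a digit and compares each numeric component as a number rather than character by character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Compare dotted version strings component by component.
 * Returns -1, 0 or 1 when a is older than, equal to or newer than b.
 * A missing trailing component counts as 0, so "5.8" equals "5.8.0".
 */
int sketch_compare_version(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);

	while (*a || *b) {
		char *end_a, *end_b;
		unsigned long na = strtoul(a, &end_a, 10);
		unsigned long nb = strtoul(b, &end_b, 10);

		/* Numeric compare, so 10 sorts after 8. */
		if (na != nb)
			return na < nb ? -1 : 1;

		a = end_a;
		b = end_b;

		/* Step over one separator ('.', '-', ...) on each side. */
		if (*a && !isdigit((unsigned char)*a))
			a++;
		if (*b && !isdigit((unsigned char)*b))
			b++;
	}
	return 0;
}
```

A plain string comparison would order "5.10" before "5.8", which is the kind of error a multi-digit version number exposes.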

Test-Parameters: trivial
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I0f585cb355bb34270003ae1139688080c301186a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54336
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
3 weeks agoLU-17592 build: kernel 6.8 -Werror=missing-prototypes 28/54228/13
Shaun Tancheff [Mon, 15 Apr 2024 18:29:30 +0000 (11:29 -0700)]
LU-17592 build: kernel 6.8 -Werror=missing-prototypes

Linux commit v6.7-rc4-156-g0fcb70851fbf
  Makefile.extrawarn: turn on missing-prototypes globally

With -Wmissing-prototypes and -Werror, clean up some additional
functions that are implicitly static and provide declarations
for those that are exported.

Add SERVER_ONLY and SERVER_ONLY_EXPORT_SYMBOL to wrap functions
that are only exported for and used by server components.
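
The pattern can be sketched as follows. The SKETCH_* macros are hypothetical stand-ins, and the assumption that a server-only wrapper expands to "static" in client-only builds is a guess at the approach, not a copy of the SERVER_ONLY definitions in the tree.

```c
#include <assert.h>
#include <limits.h>

/*
 * Assumed shape of a server-only wrapper: in a build with server
 * support the function is global and declared (normally in a header);
 * otherwise it becomes static, so no prototype is required.
 */
#ifdef SKETCH_HAVE_SERVER_SUPPORT
# define SKETCH_SERVER_ONLY
int sketch_server_helper(int v);
#else
# define SKETCH_SERVER_ONLY static
#endif

/* Exported function: the prototype would normally live in a header. */
int sketch_exported_sum(int a, int b);

/* File-local helper: marked static, so -Wmissing-prototypes is quiet. */
static int sketch_clamp(int v)
{
	return v < 0 ? 0 : v;
}

SKETCH_SERVER_ONLY int sketch_server_helper(int v)
{
	int c = sketch_clamp(v);

	/* Guard the doubling below against overflow. */
	assert(c <= INT_MAX / 2);
	return c * 2;
}

int sketch_exported_sum(int a, int b)
{
	return sketch_server_helper(a) + sketch_server_helper(b);
}
```

Every non-static function here has a prior declaration in both configurations, which is what the warning requires.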

Test-Parameters: trivial
HPE-bug-id: LUS-12181
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ice5219df5463effe964d2cd2114f003d185337da
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17379 lnet: parallelize peer discovery via LNetAddPeer 33/53933/10
Serguei Smirnov [Tue, 6 Feb 2024 03:24:01 +0000 (19:24 -0800)]
LU-17379 lnet: parallelize peer discovery via LNetAddPeer

Initiate peer discovery via its non-primary NIDs
as they are being added in LNetAddPeer by pretending
that they belong to different peers. This may be
useful if some of the comma-separated NIDs in the
mount command (including the first listed NID) are down.
If discovery is performed in the background and there's
at least one reachable NID in the list, the discovery
will succeed and peer records will get consolidated.

If primary NID locking is enabled, the first NID in the list
provided by Lustre to LNetAddPeer always gets locked as primary,
even if it doesn't get discovered.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I449cb9898c0242db874555a62fe8099352e913e6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53933
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
3 weeks agoLU-10717 tests: minor fixes to conf-sanity 55/53455/5
Andreas Dilger [Thu, 14 Dec 2023 04:13:29 +0000 (21:13 -0700)]
LU-10717 tests: minor fixes to conf-sanity

Remove use of fancy quotation marks in conf-sanity test_102.
Quiet other minor shellcheck warnings in test_30a and test_84.
Fix incorrect variable in error message in test_133.

Test-Parameters: trivial testlist=conf-sanity env=ONLY="30a 84 102"
Fixes: aa9f9344fc ("LU-10717 tests: tests should not start mgs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9825157c6b72addc6883e8bc44aea53b483ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53455
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-14361 statahead: wait inuse entry finished during cleanup 91/49291/18
Qian Yingjin [Thu, 1 Dec 2022 02:43:50 +0000 (21:43 -0500)]
LU-14361 statahead: wait inuse entry finished during cleanup

If an entry is in use by a user process while statahead is
cleaning up and quitting, statahead must wait for the in-use entry to
be finished and only then kill the locally cached entries in the
statahead context.

Add sanity/test_123{k,l} to verify it.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I747badd85bd44cb20f7d37ca3126ca308a632371
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49291
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-14361 statahead: add support for mdtest shared dir workload 54/48954/18
Qian Yingjin [Wed, 26 Oct 2022 03:01:57 +0000 (23:01 -0400)]
LU-14361 statahead: add support for mdtest shared dir workload

This patch adds statahead support for shared-directory stat() workloads
with a file name pattern like that of mdtest shared-directory stat()
access.

The performance improvements are shown as follows:
IO500 (KIOPS)      w/o patch   w/ patch
mdtest-easy-stat   740.01      1276.31
mdtest-hard-stat   514.36      1105.33
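
For readers who want the relative gain rather than raw KIOPS, the following Python sketch computes the ratio for each row of the results above. The numbers are copied from the commit message; the dictionary layout and the speedup helper are only illustrative.

```python
# Figures copied from the commit message (IO500 results, in KIOPS):
# (without patch, with patch)
results = {
    "mdtest-easy-stat": (740.01, 1276.31),
    "mdtest-hard-stat": (514.36, 1105.33),
}


def speedup(before, after):
    """Return the throughput ratio after/before."""
    return after / before


for name, (before, after) in results.items():
    print(f"{name}: {speedup(before, after):.2f}x")
```

This works out to roughly 1.7x for mdtest-easy-stat and a little over 2.1x for mdtest-hard-stat.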

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I43983e91eb864bd317cfb883e35e2f4c1a8f788c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48954
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-14361 statahead: return ENOENT for batched statahead 87/51587/21
Qian Yingjin [Thu, 6 Jul 2023 03:41:46 +0000 (23:41 -0400)]
LU-14361 statahead: return ENOENT for batched statahead

When stat on a non-existing file in a batched statahead context,
MDT should return -ENOENT immediately and stop the statahead work.

Otherwise, the client may cache the parent inode with UPDATE lock
and the non-existing dentry under the protection of the parent's
UPDATE lock wrongly.

Add sanity/test_123j to verify it.

Test-Parameters: clientdistro=el9.2 testlist=sanity env=ONLY=123i,ONLY_REPEAT=10
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia4618f605d2f38ce712e421bcd7b96688bbfbb32
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-15277 utils: lfs quota/setquota improvements 15/46615/11
Andreas Dilger [Fri, 25 Feb 2022 08:23:02 +0000 (01:23 -0700)]
LU-15277 utils: lfs quota/setquota improvements

Add long options to "lfs quota" for ease of use.  Improve usage
message for "lfs quota" and "lfs setquota" to match current code.

Deprecate the "lfs quota -i MDT_IDX|-I OST_IDX" options to print one
target, since these arguments are backward from other lfs subcommands.
Add "-o" for the OST_IDX and "-m" for the MDT_IDX and long options
"--ost" and "--mdt" to match other lfs subcommands.  We may eventually
be able to liberate "-i" to use the OST_IDX, but not for a while yet.

Fix "lfs setquota" handling of long --times option.  It was being
checked in lfs_setquota_times(), but not in has_times_option().

Sort arguments to be handled (as much as possible) in alphabetical
order for ease of use in the future.

Update lfs-quota.1 and lfs-setquota.1 man pages to describe all
options and add proper argument formatting.

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I049f22a526469ea1ed1da04beffda6bb683ebbe6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46615
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-15277 quota: don't print extra default quota info 25/45725/14
Hongchao Zhang [Wed, 20 Mar 2024 14:14:27 +0000 (22:14 +0800)]
LU-15277 quota: don't print extra default quota info

When getting quota info with "lfs quota", it's better to include
the default quota in the quota output for the specific quota ID.

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I6726888b8857f9a45a96c83db0a546b29507cf8a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45725
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-13577 wbc: reimplement mkdir() by using intent lock 47/38647/32
Qian Yingjin [Mon, 18 May 2020 07:18:08 +0000 (15:18 +0800)]
LU-13577 wbc: reimplement mkdir() by using intent lock

This patch reworks mkdir() to use an intent lock.
Instead of the reint mkdir implementation, which returns no lock,
an ibits lock (currently PR LOOKUP|PERM) is granted to the client and
cached in the client-side lock namespace by the mkdir() intent
lock request.

This is also a basic requirement for the coming WBC feature, i.e.,
when a new directory is created and an EX WBC lock is returned from
the MDT in the intent lock request, this root WBC directory can be
safely cached on the client under the protection of the root WBC EX
lock.

This patch also adds a tuning parameter "llite.*.intent_mkdir" to
enable or disable mkdir() using an intent lock. It is set to 0
by default, which disables intent mkdir().

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I94e4c2f8262d7ffb27d85b5569070049a47354d7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/38647
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-16915 tests: improve distro type checking 90/54790/5
Andreas Dilger [Fri, 12 Apr 2024 01:18:28 +0000 (19:18 -0600)]
LU-16915 tests: improve distro type checking

Improve lustre_os_release() infrastructure to reduce redundant
code and make it easier to use.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-sec env=ONLY=51,HONOR_EXCEPT=y serverdistro=el9.3
Test-Parameters: testlist=sanity env=ONLY=906,HONOR_EXCEPT=y serverdistro=el9.3
Fixes: b881bd1051 ("LU-16915 tests: except sanity-sec test_51")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id02223752df4eb3fd3b62b339e8c417eb33ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54790
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-17025 llapi: restore 'pool=ignore' functionality 55/54355/8
Rajeev Mishra [Mon, 11 Mar 2024 18:53:11 +0000 (18:53 +0000)]
LU-17025 llapi: restore 'pool=ignore' functionality

Changes to llapi_stripe_param_verify() and related llapi file
creation functions to verify that the given pool name is valid
introduced a bug that disallowed the 'ignore' pool name, which
is used to create files without any pool name.

Allow the reserved pool names from lov_pool_is_reserved() to be
used even (especially!) if the named pool does not exist.

Revert the changes to ost-pools.sh::test_32() that created the
'ignore_pool' pool, and go back to checking that 'ignore' will
create a file that does not use any pool.

Change the pool name validation to only do fsname lookup if the
pool name is actually specified, instead of looking up fsname
but not actually using it for anything.

Fixes: ee7dfc5ad1 ("LU-17025 llapi: Verify stripe pool name")
Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
Change-Id: I9368f28a41fd9af6b6f0e9468df0e7dfd728db1c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54355
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-6142 osd-zfs: Fix style issues for osd_lproc.c 66/54266/3
Arshad Hussain [Mon, 4 Mar 2024 06:53:11 +0000 (01:53 -0500)]
LU-6142 osd-zfs: Fix style issues for osd_lproc.c

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_lproc.c

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icb7b2a5805cddbd14458ed71835f5e12f14d18ea
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54266
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-6142 osd-zfs: Fix style issues for osd_internal.h 63/54263/4
Arshad Hussain [Mon, 4 Mar 2024 08:04:40 +0000 (03:04 -0500)]
LU-6142 osd-zfs: Fix style issues for osd_internal.h

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_internal.h

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iba857ae53a1a579dfc3ef6e422bcb3c47dd88cf1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54263
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-6142 osd-zfs: Fix style issues for osd_handler.c 62/54262/4
Arshad Hussain [Mon, 4 Mar 2024 09:00:15 +0000 (04:00 -0500)]
LU-6142 osd-zfs: Fix style issues for osd_handler.c

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_handler.c

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ibf1d954b8c1e3e64d3ae1661cfecbb09569ba955
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54262
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-6142 osd: Fix style issues for osd_object.c 57/54257/3
Arshad Hussain [Sun, 3 Mar 2024 16:55:29 +0000 (22:25 +0530)]
LU-6142 osd: Fix style issues for osd_object.c

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_object.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I32a91583b37752a722cf558dfa14f191163090b3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54257
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-6142 osd: Fix style issues for osd_xattr.c 56/54256/2
Arshad Hussain [Sun, 3 Mar 2024 17:20:40 +0000 (22:50 +0530)]
LU-6142 osd: Fix style issues for osd_xattr.c

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_xattr.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I446e990ba4865943d17087beaf8e53082bae9131
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54256
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-6142 llite: Fix style issues for rw.c 41/54141/2
Arshad Hussain [Thu, 22 Feb 2024 06:39:08 +0000 (12:09 +0530)]
LU-6142 llite: Fix style issues for rw.c

This patch fixes issues reported by checkpatch
for file lustre/llite/rw.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7acdf52f598d26d7b54b5c63384c99ea14fa6e26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54141
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17250 mgs: generate a new MDT configuration by copy 14/53614/11
Etienne AUJAMES [Mon, 8 Jan 2024 15:06:08 +0000 (16:06 +0100)]
LU-17250 mgs: generate a new MDT configuration by copy

The configuration for a new MDT is generated by reading the client
configuration. The MGS filters the existing mdc/osc records, interprets
them, and then creates the corresponding osp/osc device for the MDT.

The main idea of this patch is first to convert and copy the records
from the client configuration to create the new MDT.
And then, copy the remaining record sections from an existing MDT.
So the new MDT can inherit OST pools and parameters from the existing
one.

This avoids complex compatibility checks for IPv4/v6 NIDs because
add_uuid records are copied without needing to parse NIDs.
This also allows copying the "add failnid" section from the client.

This patch extends the usage to the "add failnid" section on MDT
configurations.

Here are the steps to copy an existing MDT configuration:

1/ read client configuration and generate osp MDT/OST records for the
   new MDT
2/ find an existing MDT configuration
3/ copy and convert the remaining configuration records from the
   existing MDT configuration (parameters and OST pools)

Add the regression test conf-sanity 137.

Test-Parameters: mdtcount=4 fstype=zfs testlist=conf-sanity
Test-Parameters: mdtcount=4 fstype=ldiskfs testlist=conf-sanity
Test-Parameters: mdtcount=4 fstype=zfs testlist=conf-sanity env=ONLY=137,ONLY_REPEAT=10
Test-Parameters: mdtcount=4 fstype=ldiskfs testlist=conf-sanity env=ONLY=137,ONLY_REPEAT=10
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I4a99085b8930a0dd8002bde87d4e8c575aaccba0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53614
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-4341 tests: re-enable SLES sanity test_170 test_243 77/54777/6
Andreas Dilger [Sun, 14 Apr 2024 03:08:48 +0000 (21:08 -0600)]
LU-4341 tests: re-enable SLES sanity test_170 test_243

Re-enable tests on SLES that have been disabled since SLES11.
The SLES version check was broken and these were already
running on SLES15 without issues.

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY="170 243",ONLY_REPEAT=20 clientdistro=sles15sp5
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0f837ac5180d0754b67f349592503267aa2c5f52
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-16822 lnet: always initialize IPv6 at start up 18/51818/6
James Simmons [Mon, 22 Apr 2024 14:08:11 +0000 (10:08 -0400)]
LU-16822 lnet: always initialize IPv6 at start up

Currently lnet_inet_enumerate() has a bool parameter that enables
collecting IPv6 addresses for selection which is optional. This
patch changes the behavior to always collect proper IPv6 addresses
and now the bool flag means prefer the IPv6 over any IPv4 addresses.
Update the user land applications lctl and lnetctl to send a flag
to select IPv6 or IPv4 at initialization of the LNet stack. This
is useful for IPv6 and other large NID type testing.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ib3f38de15b1295ec1f8e8607dbd971583541f06c
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51818
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-17744 ldiskfs: mballoc stats fixes 72/54772/3
Alexander Zarochentsev [Sun, 31 Mar 2024 20:21:56 +0000 (20:21 +0000)]
LU-17744 ldiskfs: mballoc stats fixes

Change mballoc statistics to use correct
allocation loop ids.

Fixes: 95f8ae56774 ("LU-12103 ldiskfs: don't search large block range if disk full")
HPE-bug-id: LUS-11936
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I892ead5355865ec9c07fdc758e127c711b42cb1b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54772
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-17724 gss: fix bad use of user buffer in rsi upcall 30/54730/3
Sebastien Buisson [Thu, 11 Apr 2024 06:58:19 +0000 (08:58 +0200)]
LU-17724 gss: fix bad use of user buffer in rsi upcall

Use the proper kernel buffer to print message out when
upcall_cache_set_upcall() returns an error.

Fixes: 2153e86541 ("LU-17497 obdclass: check upcall incorrect values")
Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ice781b4506822f1fd4ce0a062ce742f51e366525
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54730
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-17717 tests: skip sanity-lnet/252 for interop 07/54707/4
Alex Zhuravlev [Tue, 9 Apr 2024 10:14:01 +0000 (13:14 +0300)]
LU-17717 tests: skip sanity-lnet/252 for interop

Skip the subtest in interop testing, as it fails by detecting a
memory leak that was fixed only recently.

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ide80e0b39a053a2774804b025306ebdb1fc964a8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54707
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-17713 mdd: validate the length of mdd append_pool name 91/54691/6
Emoly Liu [Wed, 10 Apr 2024 09:18:03 +0000 (09:18 +0000)]
LU-17713 mdd: validate the length of mdd append_pool name

Validate the length of mdd append_pool name (<= LOV_MAXPOOLNAME)
before saving it in function append_pool_store().
Also, sanity.sh test_27M is improved a little to verify this fix.
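
The check described above amounts to rejecting an over-long name before it is stored. The Python sketch below shows the idea only; it is not the kernel code in append_pool_store(). The limit of 15 for LOV_MAXPOOLNAME is an assumption for this example (the authoritative value is in the Lustre headers), and validate_pool_name is a hypothetical helper.

```python
# Assumed value for illustration; the real limit is defined in the Lustre headers.
LOV_MAXPOOLNAME = 15


def validate_pool_name(raw):
    """Return the cleaned pool name, or raise ValueError if it is too long."""
    # Input written to a tunable typically ends with a newline; drop it first.
    name = raw.strip()
    if len(name) > LOV_MAXPOOLNAME:
        raise ValueError(
            f"pool name is {len(name)} characters, limit is {LOV_MAXPOOLNAME}"
        )
    return name
```

Validating before the copy means an invalid value never reaches the stored field, rather than being truncated silently.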

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Id7083fab60e9a18af4d8eedfa3d55f37544ba15d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54691
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
4 weeks agoLU-17297 tests: recovery-small_156 interop check 46/54646/3
Sergey Cheremencev [Mon, 1 Apr 2024 19:31:15 +0000 (22:31 +0300)]
LU-17297 tests: recovery-small_156 interop check

Don't start recovery-small_156 "tot_granted miscount
after client eviction" with OSTs less than 2.15.60.

Test-Parameters: trivial testlist=recovery-small env=ONLY=156 serverversion=2.15
Fixes: 9df01eee75 ("LU-17297 grant: move tgt_grant_sanity_check() calls")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I800ac435dcba267b9a60a919d007428bb8af7f90
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54646
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-17692 flock: get extra reference for lockd 22/54622/5
Yang Sheng [Thu, 28 Mar 2024 19:54:06 +0000 (03:54 +0800)]
LU-17692 flock: get extra reference for lockd

We should get local locking first for GETLK. Otherwise,
the lock_owner could be released while working with
lockd.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I56e4204e315c2bdbc496b7961519ae45ab1820fe
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54622
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-17688 ofd: access log to release chardev 06/54606/6
Alex Zhuravlev [Thu, 28 Mar 2024 11:35:59 +0000 (14:35 +0300)]
LU-17688 ofd: access log to release chardev

Due to a missing put_device(), the OFD access log leaks a number of structures.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I36109738201b98025bbd2e6ed7c8830044e505c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54606
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 weeks agoLU-17684 mdt: lprocfs_mdt_open_files_seq_open() leaks op_data 91/54591/7
Alex Zhuravlev [Wed, 27 Mar 2024 18:54:01 +0000 (21:54 +0300)]
LU-17684 mdt: lprocfs_mdt_open_files_seq_open() leaks op_data

op_data is allocated in single_open() and the paired single_release()
is supposed to free it, but seq_release() was used instead.

The same applies to ldlm_granted_fops.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I91846ea7a2c896cb57b878905db4f3630939a652
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54591
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 weeks agoLU-17672 ldiskfs: release s_mb_prealloc_table 53/54553/10
Alex Zhuravlev [Mon, 25 Mar 2024 07:14:46 +0000 (10:14 +0300)]
LU-17672 ldiskfs: release s_mb_prealloc_table

Release s_mb_prealloc_table at umount.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0a5cbf646c9bd73461691c49c6e7a509acd5a500
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54553
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
4 weeks agoLU-17650 gss: fix use out of bounds in ptlrpc_gss 52/54452/6
Oleg Drokin [Tue, 19 Mar 2024 03:10:13 +0000 (23:10 -0400)]
LU-17650 gss: fix use out of bounds in ptlrpc_gss

KASAN highlighted that the sockaddr_un struct is not big enough
for the kernel primitives we use, so we have to use the
bigger sockaddr_storage for allocation. Alas, the field
names inside are different, so we have to jump through some
hoops to make it actually work.
Also, for a 128-byte allocation an on-stack variable is fine and
cannot fail, so convert to that.

Change-Id: I2292900b54756bf39530c96f7c5c228835562bef
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
4 weeks agoLU-17630 osc: add cond_resched() to osc_lru_shrink() 46/54346/6
Alex Zhuravlev [Mon, 11 Mar 2024 07:42:24 +0000 (10:42 +0300)]
LU-17630 osc: add cond_resched() to osc_lru_shrink()

osc_lru_shrink() may need to handle lots of pages and can thus
block scheduling for a long time. Add a couple of cond_resched() calls
to prevent kernel warnings and starvation of other threads.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I862c568ac777c0b929a1ffb61e246b079aee6718
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54346
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 weeks agoLU-17000 utils: Fix do_warn_interval resource leak 30/54330/8
Arshad Hussain [Fri, 8 Mar 2024 10:12:26 +0000 (15:42 +0530)]
LU-17000 utils: Fix do_warn_interval resource leak

In do_warn_interval(), the opened 'fd' was not closed
when write() returned an error. This leak is fixed by
calling close() before returning.

This patch also checks the return value of futimens() and
logs an error if it fails.
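
The pattern being fixed is general: every exit path after a successful open must release the descriptor. The Python sketch below is an analogy rather than the C code in do_warn_interval(); write_marker and its arguments are hypothetical. Passing a descriptor to os.utime() plays the role of futimens() here and is only available on platforms listed in os.supports_fd, such as Linux.

```python
import os


def write_marker(path, data):
    """Write data to path, refresh its timestamps, and always close the fd."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = os.write(fd, data)
        # Counterpart of futimens(fd, NULL): set atime/mtime to the current time.
        # A failure raises OSError instead of being silently ignored.
        os.utime(fd)
        return written
    finally:
        # Runs on success and on every error path, so the fd cannot leak.
        os.close(fd)
```

In C the same guarantee is usually obtained with a single cleanup label that calls close() before returning.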

CoverityID: 415056 ("Resource leak")
Fixes: a454c9efd8 (LU-17137 utils: Deprecate l_getidentity 'files' alias)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ice0269d524e237a4fc421b2a91d8f26b5e41b13f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54330
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 weeks agoLU-8130 nrs: for TBF nid handling using rhashtables 93/54193/7
James Simmons [Wed, 27 Mar 2024 17:21:37 +0000 (11:21 -0600)]
LU-8130 nrs: for TBF nid handling using rhashtables

While looking at the nrs code for lnet_nid_t I saw TBF was not
using struct lnet_nid. For the first step to support large NIDs
I moved the current use of cfs_hash to rhashtables. This doesn't
complete IPv6 support, but it's a first step since the rhashtable
can use large NIDs. The next step will be updating tr_nids handling.

With this port I found the refcount handling to be incorrect.
Before this work I saw in the debug logs

Busy TBF object from client with NID 0@lo, with -1073741824 refs

and nrs_tbf_res_put() never cleans up struct nrs_tbf_client until
the filesystem is unmounted. With this patch we do clean up
each nrs_tbf_client after we are done with the policy. With this being
the case nrs_tbf_nid_hop_exit() should be called unless something
is wrong.

Test-Parameters: trivial testlist=sanityn
Change-Id: Iab69a16c12ed89f0694af7bcfe9158f468838ca4
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54193
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-17422 clio: use page pools for UDIO/hybrid 70/53670/17
Patrick Farrell [Wed, 27 Mar 2024 21:45:02 +0000 (17:45 -0400)]
LU-17422 clio: use page pools for UDIO/hybrid

This moves unaligned/hybrid IO to using page pools.  This
reduces the time spent in memory allocation while doing IO
to near zero, at least in simple tests.

This should close most of the performance gap between
udio/hybrid and regular DIO for reads, taking them from
~13 GiB/s to close to 20 GiB/s.  This should also scale as
DIO performance improves.

The improvement for writes is much more limited, because
UDIO writes do not have parallel data copy yet.  This will
improve UDIO write performance by perhaps 10-20%, so from
~2.5 GiB/s to ~3.0 GiB/s, very roughly.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I0cb8b5881bf2885a926383291f67fa252b56574f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
4 weeks agoLU-17422 osc: Clear PageChecked on bounce pages 65/53865/10
Patrick Farrell [Wed, 27 Mar 2024 21:44:47 +0000 (17:44 -0400)]
LU-17422 osc: Clear PageChecked on bounce pages

When we're finalizing a bounce page, we must clear
PageChecked.  Otherwise, if it's a page pool page, it will
be reused without the full wipe the kernel gives it, and we
will see PageChecked on pages which are not actually from
encryption and will handle them incorrectly.

Fixes: f3fe144b85 ("LU-15003 sec: use enc pool for bounce pages")
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I8b319e7ba55dd883d74db79a19bf93b6f125616a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
4 weeks agoLU-17422 obdclass: rename sptlrpc pool and move init 69/53669/14
Patrick Farrell [Wed, 27 Mar 2024 21:44:15 +0000 (17:44 -0400)]
LU-17422 obdclass: rename sptlrpc pool and move init

This patch completes the move of the pools code to obd by
renaming the sptlrpc pool to obd, and moves the pool init
and cleanup to obd.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I9164601745c8faf19559216f55ea5df4e2e226fe
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
4 weeks agoLU-17422 obdclass: rename ptlrpc_page_pool 68/53668/14
Patrick Farrell [Wed, 27 Mar 2024 21:41:21 +0000 (17:41 -0400)]
LU-17422 obdclass: rename ptlrpc_page_pool

This patch renames the ptlrpc page pool to reflect its new
place in obd.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I67aa5f3eef26b5fb890e62bced837bea9dd032c6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53668
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
4 weeks agoLU-17422 obdclass: move page pools to obdclass 67/53667/14
Patrick Farrell [Wed, 27 Mar 2024 21:34:09 +0000 (17:34 -0400)]
LU-17422 obdclass: move page pools to obdclass

This patch starts the process of moving page pools to
obdclass by moving the file and making the changes necessary
to compile and run Lustre with the file moved.

This does not rename anything in the file yet; that will be
done in subsequent patches.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Iff39dd9ddfb105773f8eafa4754d32189067189b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53667
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>