Whamcloud - gitweb
Chris Hunter [Thu, 6 Jun 2024 05:44:12 +0000 (01:44 -0400)]
LU-17899 gss: improved systemd unit file for SSK daemon
Add operation ordering to lsvcgss initscript/service unit
so it starts after systemd network services are running.
Lustre-change: https://review.whamcloud.com/55379
Lustre-commit: TBD (from cc08ebd0fb8f370451408c57b86001323b4da4dc)
Signed-off-by: Chris Hunter <chunter@ddn.com>
Change-Id: Iad39d01aae16732ff646383814033d6efb34af5e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Jian Yu [Fri, 7 Jun 2024 17:44:11 +0000 (10:44 -0700)]
LU-17404 kernel: update RHEL 9.4 [5.14.0-427.20.1.el9_4]
Update RHEL 9.4 kernel to 5.14.0-427.20.1.el9_4 for Lustre client.
Lustre-change: https://review.whamcloud.com/54712
Lustre-commit: TBD (from 527a21ce444b46034e45de185a3bd39727353abb)
Test-Parameters: trivial \
mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3
Change-Id: Ieee88a5a9f8e58f8445e126d21e45228e7b5ca64
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55367
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Frederick Dilger [Sat, 25 May 2024 23:23:20 +0000 (19:23 -0400)]
LU-17343 utils: added --path option for lctl list_param
Added an 'lctl list_param [-p] PARAM' option that prints the
actual pathname(s) for PARAM instead of the parameter name(s).
This should allow users to "resolve" PARAM pathnames so that they
can be used directly, which avoids having to hard code them. Also
renamed "po_only_path" and "po_show_path" to be "po_only_name" and
"po_show_name" to avoid confusion with "po_only_pathname" for the new
option.
Lustre-change: https://review.whamcloud.com/55202
Lustre-commit: e1a9d08351721d280faed51a2061e3e16f25a6b2
Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I2259b930f3ac5cc46ac7a9a36218a44fa110157c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55331
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 28 Mar 2024 03:18:56 +0000 (21:18 -0600)]
LU-16500 utils: 'lfs migrate' should select new OSTs
When migrating a file using "lfs migrate FILE" without any arguments
to specify a new layout, this should migrate the file to the best
OSTs available at that time based on free space, instead of keeping
the file on the same OSTs (which is almost pointless otherwise).
Reset the starting OST index for all components of the copied file
layout so that this can happen properly. Previously, only the last
component had the OST index reset, which was only partly helpful.
Add llapi_layout_ost_index_reset() to handle this, since it seems
likely that tools using llapi_layout_from_fd() and friends to copy
an existing layout will want to do the same. Add the corresponding
man page and reference it from llapi_layout_get_from_fd().
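The difference between resetting only the last component and resetting every component can be sketched as follows. This is a toy model in Python, not the llapi implementation: the `Component` type, its fields, and the `OST_INDEX_ANY` sentinel are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import List

OST_INDEX_ANY = -1  # hypothetical sentinel meaning "let the allocator choose"


@dataclass(frozen=True)
class Component:
    extent_end: int    # end offset of the component, -1 for EOF
    stripe_count: int
    ost_index: int     # starting OST index copied from the source file


def reset_last_only(layout: List[Component]) -> List[Component]:
    """Behaviour described as the previous state: only the last component is reset."""
    if not layout:
        return []
    return layout[:-1] + [replace(layout[-1], ost_index=OST_INDEX_ANY)]


def reset_all(layout: List[Component]) -> List[Component]:
    """Behaviour described for this patch: every component is reset."""
    return [replace(c, ost_index=OST_INDEX_ANY) for c in layout]
```

With the first variant, all but the last component would stay pinned to the OSTs of the source file; with the second, every component is free to be placed on a new OST.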
Update sanity test_56xe to check that the starting OST index of each
component is not the same for all components. This check might not
catch a broken "lfs migrate" every time since even before this patch
the last component would be allocated on a random OST, but will still
fail about once every 1/$OST_COUNT runs. Conversely, with this patch
it passes hundreds of iterations without a false positive, though a
small chance exists that it will have a false positive on occasion.
Add a "make utils" target to simplify building only user utilities.
Lustre-change: https://review.whamcloud.com/54600
Lustre-commit: 2007ab4709acaef0397df15c9f4cf4387844ba9c
Test-Parameters: testlist=sanity env=ONLY=56xe,ONLY_REPEAT=100
Fixes: 0568f4ca25 ("LU-16500 utils: set default ost index for lfs migrate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie4c68d4b2ff09560a7a13ae464723745cf968d36
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Etienne AUJAMES [Tue, 12 Sep 2023 16:06:25 +0000 (18:06 +0200)]
LU-17110 llite: fix slab corruption with fm_extent_count=0
If userspace uses fiemap with .fm_extent_count=0, .fm_extents[0] is
not allocated. Writing on the first entry without checking the extent
count could lead to memory corruption (slab).
This patch also fixes the case where the osc is disabled: FIEMAP_EXTENT_LAST
should be set on the extent (fe_flags) and not on the fiemap struct.
Add a regression test sanityn 71d to test fiemap with
fm_extent_count=0.
Add a regression test sanity-hsm 408 to test fiemap on released files.
Lustre-change: https://review.whamcloud.com/52352
Lustre-commit: a81dc7d0e158894e905ab3d309f7b92864a94378
Fixes: 4097196 ("LU-11848 lov: FIEMAP support for PFL and FLR file")
Test-Parameters:testlist=sanityn env=ONLY=71d,ONLY_REPEAT=20
Test-Parameters:testlist=sanity-hsm env=ONLY=408,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: Id63c6973540187e678020977f2d555dfcbf3c634
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55363
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andrew Perepechko [Mon, 16 Jan 2023 13:13:34 +0000 (08:13 -0500)]
LU-16480 lov: fiemap improperly handles fm_extent_count=0
FIEMAP calls with fm_extent_count=0 are supposed only to
return the number of extents.
lov_object_fiemap() attempts to initialize stripe_last
based on fiemap->fm_extents[0] which is not initialized
in userspace and not even allocated in kernelspace.
Eventually, the call exits with -EINVAL and "FIEMAP does
not init start entry" kernel log message.
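The intended handling of fm_extent_count=0 can be sketched as below. This is a simplified Python model, not the lov_object_fiemap() code: the `fiemap` function and the (offset, length) tuples are hypothetical stand-ins for the real structures.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (logical offset, length), heavily simplified


def fiemap(file_extents: List[Extent],
           extent_count: int) -> Tuple[int, List[Extent]]:
    """Return (number of mapped extents, extents copied to the caller).

    With extent_count == 0 the caller only asks for the count, so there
    is no extent array at all and entry [0] must never be touched.
    """
    if extent_count == 0:
        return len(file_extents), []
    mapped = file_extents[:extent_count]
    return len(mapped), mapped
```

The early return is the whole point of the sketch: nothing after it may assume that a first entry exists.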
Lustre-change: https://review.whamcloud.com/49645
Lustre-commit: 829af7b029d8e4e391b93792bf5214611b0193bd
Fixes: 409719608c ("LU-11848 lov: FIEMAP support for PFL and FLR file")
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: I65e706b5dd5c8a6db90a539c2602af839b4da823
HPE-bug-id: LUS-11443
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Mon, 3 Jun 2024 11:52:20 +0000 (13:52 +0200)]
LU-17899 gss: lsvcgss service fix
The lsvcgss service can fail to start if the daemon is invoked with
the '-k' option whereas no proper Kerberos configuration is in place
on the server. The daemon should ignore the '-k' option in such a case
and try to start the other provided modes if any (SSK, Null).
And in case the daemon is started with the '-s' option (SSK), it
spawns a temporary additional thread to compute the number of rounds
used for Miller-Rabin prime testing. So the lsvcgss_sysd script should
support that.
Lustre-change: https://review.whamcloud.com/55293
Lustre-commit: f28a7a33a8254fc25c8cb348f87a0c133286393f
Fixes: ac1ea2ef12 ("LU-17741 gss: fix lsvcgss service for systemd")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iba632bd0ea9696ccea52bff5982a4d4e490597a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55294
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Fri, 7 Jun 2024 09:04:27 +0000 (02:04 -0700)]
LU-17402 kernel: update RHEL 8.10 [4.18.0-553.5.1.el8_10]
Update RHEL 8.10 kernel to 4.18.0-553.5.1.el8_10.
Lustre-change: https://review.whamcloud.com/55350
Lustre-commit: TBD (from 66e63642f81f4d3059fa1969b9e510d172c374d0)
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-3
Change-Id: Iad6dc4f6294beeed1db44d8484b325a771bc1ad4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55353
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 9 Apr 2024 13:00:41 +0000 (15:00 +0200)]
LU-17718 obdclass: potential string overflow upcall_cache.c
Use strncpy() in upcall_cache_set_upcall() to quiet Coverity warning.
And reorganize the function so that the code flow is more linear in
the success case.
CoverityID: 424705: ("String overflow")
Lustre-change: https://review.whamcloud.com/54710
Lustre-commit: 7869bb320e735547410a7d3e31061b9044389c53
Fixes: a462a119ec ("LU-17497 obdclass: check upcall incorrect values")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1aee77f78c92c6c571dfe358435a2733cc3ba9d9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Sun, 19 May 2024 00:38:33 +0000 (20:38 -0400)]
EX-9875 test: limit dir restripe overstripe count
Without the LU-15527 code, distributed transactions are slow. To
avoid test timeout, limit overstripe count and increase timeout for
sanity test_300ud and test_300ue.
Test-Parameters: trivial
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I830ac27e446f3841147be4777ba06cdb8e1a7f59
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55347
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 10 Jun 2024 03:11:30 +0000 (21:11 -0600)]
EX-9125 tests: exclude sanity-compr/1008 on Ubuntu
This subtest is failing consistently on Ubuntu. Disable until it
can be fixed.
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1008,HONOR_EXCEPT=y clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie71462e7f033be914523ca96b22478a53b81b882
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55374
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 7 Jun 2024 07:34:52 +0000 (01:34 -0600)]
RM-620 build: New tag 2.14.0-ddn152
New tag 2.14.0-ddn152
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If85c2c311326aae73f27e15c7fa1358c393a509c
Jian Yu [Thu, 6 Jun 2024 07:34:00 +0000 (00:34 -0700)]
EX-9125 tests: change source dir for sanity-compr/1008
When the source directory contains almost only small files,
sanity-compr test_1008 will hit the
"failed estimates > 50% of total estimates" failure on SLES15.
The patch fixes the issue by changing the source dir to
contain large files.
Test-Parameters: trivial clientdistro=sles15sp5 \
testlist=sanity-compr env=ONLY="1008",ONLY_REPEAT=3
Test-Parameters: trivial clientdistro=el9.3 \
testlist=sanity-compr env=ONLY="1008",ONLY_REPEAT=3
Test-Parameters: trivial clientdistro=el8.8 \
testlist=sanity-compr env=ONLY="1008",ONLY_REPEAT=3
Change-Id: Iad661750cba7c9d2204f2306e73169deb012ddf4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mikhail Pershin [Thu, 6 Jun 2024 11:12:00 +0000 (14:12 +0300)]
LU-15644 llog: don't report warning in no error case
Fix the wrong check, which wrongly includes the valid rc == 0 case.
Fixes: 53d946a1222 (LU-15644 llog: don't replace llog error with -ENOTDIR)
Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id6e7b2cd42b4769765c67d418552a13f048ea050
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Wed, 29 May 2024 12:26:29 +0000 (14:26 +0200)]
EX-9121 lipe: Add statistics merging for directories
This patch adds the ability to merge statistics for directories.
This is the first of two patches and contains the basic collection of
information from json as well as basic output.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I253f7606b66921eddf52709931cc1c880e66a997
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55233
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Fri, 24 Nov 2023 09:26:10 +0000 (17:26 +0800)]
EX-8355 csdc: stop compressing incompressible file data
The reduced_ratio (original_size/compress_reduced_size) represents
the minimum fraction of pages that are compressed out of each chunk,
namely the compressed chunk needs to shrink by at least
1/reduced_ratio blocks for it to be "compressible".
Let the compression_ratio be defined as
original_size/after_compression_size, so
reduced_ratio = compression_ratio / (compression_ratio - 1)
and we set its default value to 16, equivalent to 1.07 of compression
ratio (i.e. needs to shrink at least one 4KB block out of each 64KB
chunk).
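As a quick check of the arithmetic above, here is a small Python sketch of the two ratios. The function names are illustrative only and do not correspond to symbols in the patch.

```python
def reduced_ratio(compression_ratio: float) -> float:
    """reduced_ratio = compression_ratio / (compression_ratio - 1)."""
    if compression_ratio <= 1.0:
        raise ValueError("data did not shrink, reduced_ratio is undefined")
    return compression_ratio / (compression_ratio - 1.0)


def min_compression_ratio(reduced_ratio_limit: float) -> float:
    """Invert the formula: smallest compression ratio still deemed compressible."""
    if reduced_ratio_limit <= 1.0:
        raise ValueError("reduced_ratio must be greater than 1")
    return reduced_ratio_limit / (reduced_ratio_limit - 1.0)


def min_saved_bytes(chunk_size: int, reduced_ratio_limit: int) -> int:
    """Bytes a chunk must shrink by: 1/reduced_ratio of the chunk size."""
    return chunk_size // reduced_ratio_limit
```

For the default of 16, `min_compression_ratio(16)` is 16/15, which rounds to 1.07, and `min_saved_bytes(64 * 1024, 16)` is 4096, matching the "one 4KB block out of each 64KB chunk" statement.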
After every compress_check_bytes of data being compressed, file's
compressibility would be re-calculated based on average
compress_reduced and average compress_orig data size.
Stop compressing file data if it is deemed to be incompressible, and
after compress_skip_bytes of data have been written uncompressed, retry
the file compressibility check.
compress_reduced_ratio, compress_check_bytes, compress_skip_bytes
are tunable parameters:
osc.*.compress_reduced_ratio
osc.*.compress_check_bytes
osc.*.compress_skip_bytes
their default values are 16, 1M and 32M respectively.
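The check/skip cycle described above can be modelled roughly as follows. This is a hedged Python sketch, not the osc code: the `CompressGate` class and its methods are hypothetical, and comparing running totals (rather than explicit averages) is an assumption made here for simplicity.

```python
class CompressGate:
    """Toy model of the compress / check / skip cycle for one file."""

    def __init__(self, reduced_ratio=16, check_bytes=1 << 20,
                 skip_bytes=32 << 20):
        self.reduced_ratio = reduced_ratio
        self.check_bytes = check_bytes
        self.skip_bytes = skip_bytes
        self.compressing = True
        self._orig = 0      # bytes fed to the compressor since last check
        self._reduced = 0   # bytes saved by compression since last check
        self._skipped = 0   # bytes written uncompressed while skipping

    def should_compress(self) -> bool:
        return self.compressing

    def record_compressed(self, orig_size: int, compressed_size: int) -> None:
        self._orig += orig_size
        self._reduced += max(orig_size - compressed_size, 0)
        if self._orig >= self.check_bytes:
            # Compressible only if the data shrank by >= 1/reduced_ratio.
            if self._reduced * self.reduced_ratio < self._orig:
                self.compressing = False
                self._skipped = 0
            self._orig = self._reduced = 0

    def record_uncompressed(self, size: int) -> None:
        self._skipped += size
        if self._skipped >= self.skip_bytes:
            # Enough data written uncompressed: retry the check.
            self.compressing = True
            self._skipped = 0
```

A caller would consult `should_compress()` before each chunk and report the outcome through one of the two `record_*` methods.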
Test-Parameters: testlist=sanity-compr
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I4ce3d752c67f18ba7b100c72a2bb61a91258c6e8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andriy Skulysh [Wed, 3 Apr 2024 10:34:32 +0000 (13:34 +0300)]
LU-17871 ldlm: FLOCK ownlocks may be not set
Conflict checking loop should continue until ownlocks is set.
Ownlocks variable is essential for lock merges.
Lustre-change: https://review.whamcloud.com/55184
Lustre-commit: ede8d928d6c47551371512c80dfa4f159260e7e2
Fixes: b07a57027e (LU-15402 ldlm: speedup RD flock enqueue)
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: Ied526581dd7d4f100c95f2fe582d117a87a8a584
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55246
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 6 Jun 2024 08:37:21 +0000 (02:37 -0600)]
RM-620 build: New tag 2.14.0-ddn151
New tag 2.14.0-ddn151
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I254278cbf7ac546de7ce6005d5e9a35cb0952556
Andreas Dilger [Thu, 6 Jun 2024 08:37:00 +0000 (02:37 -0600)]
RM-620 build: New tag lipe-2.52
New tag lipe-2.52
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5939742a929cc91e92e1608592d1d2801d1fef4a
Hongchao Zhang [Fri, 2 Feb 2024 05:58:59 +0000 (13:58 +0800)]
LU-14535 quota: get all quota info in LFS
This patch adds option "-a" for LFS to get the quota info of
all quota IDs. It iterates over the quota settings saved in the global
quota setting files "quota_master/md-0x0" and "quota_master/dt-0x0"
from the QMT, and over the quota usage info saved in the acct quota
files in the backend FS (LDiskFS or ZFS) from the QSDs, then merges
the two kinds of quota info on the client and prints them in a similar
way to "lfs quota -u|-g|-p".
$ lfs quota -a -u /mnt/lustre
Filesystem /mnt/lustre, Disk usr quotas
quota_id    kbytes  quota   limit  grace  files  quota  limit  grace
root          9684      0       0      -   1019      0      0      -
bin              4      0  102400      -      1      0  10240      -
daemon           4      0  102400      -      1      0  10240      -
adm              4      0  102400      -      1      0  10240      -
lp               4      0  102400      -      1      0  10240      -
sync             4      0  102400      -      1      0  10240      -
shutdown         4      0  102400      -      1      0  10240      -
halt             4      0  102400      -      1      0  10240      -
mail             4      0  102400      -      1      0  10240      -
$ lfs quota -a -g /mnt/lustre
Filesystem /mnt/lustre, Disk grp quotas
quota_id    kbytes  quota   limit  grace  files  quota  limit  grace
root          9684      0       0      -   1019      0      0      -
bin              4      0  204800      -      1      0  20480      -
daemon           4      0  204800      -      1      0  20480      -
adm              4      0  204800      -      1      0  20480      -
lp               4      0  204800      -      1      0  20480      -
sync             4      0  204800      -      1      0  20480      -
shutdown         4      0  204800      -      1      0  20480      -
halt             4      0  204800      -      1      0  20480      -
mail             4      0  204800      -      1      0  20480      -
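The client-side merge of limits and usage might look roughly like the sketch below. It is a Python illustration only: the `Limits` type and `merge_quota` function are hypothetical, and summing usage across several QSDs is an assumption about how per-target usage is combined.

```python
from typing import Dict, List, NamedTuple, Tuple


class Limits(NamedTuple):
    soft: int  # "quota" column
    hard: int  # "limit" column


def merge_quota(settings: Dict[int, Limits],
                usage_per_qsd: List[Dict[int, int]]
                ) -> Dict[int, Tuple[int, int, int]]:
    """Merge limits (from the QMT) with usage (from the QSDs) by quota ID.

    Returns {id: (used, soft, hard)}. IDs that have usage but no limits
    get zero limits; IDs that have limits but no usage get zero usage.
    """
    used: Dict[int, int] = {}
    for usage in usage_per_qsd:
        for qid, amount in usage.items():
            used[qid] = used.get(qid, 0) + amount

    merged = {}
    for qid in sorted(settings.keys() | used.keys()):
        lim = settings.get(qid, Limits(0, 0))
        merged[qid] = (used.get(qid, 0), lim.soft, lim.hard)
    return merged
```

Using the union of both key sets is what lets an ID such as root, which has usage but no limits in the output above, still appear in the listing.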
Lustre-change: https://review.whamcloud.com/42098
Lustre-commit: 3edc71803af3b4dc672313cd1ba395de724fbc59
Test-Parameters: testlist=sanity-quota env=SLOW=yes,ONLY=49,NUM_QIDS=20000
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I08feb928fbf34635ec9c5c341de993c718798dc9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/46328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 30 May 2024 01:20:36 +0000 (18:20 -0700)]
LU-17750 kernel: update SLES15 SP4 [5.14.21-150400.24.100.2]
Update SLES15 SP4 kernel to 5.14.21-150400.24.100.2 for Lustre client.
Lustre-change: https://review.whamcloud.com/54823
Lustre-commit: TBD (from 0406b98b5178074c86710262f33d9315d6306116)
Test-Parameters: trivial
Change-Id: I401e97f602e6c8c62fac73e3603eb0226745bba1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Artem Blagodarenko [Mon, 3 Jun 2024 10:57:33 +0000 (06:57 -0400)]
EX-9878 csdc: is_chunk_start should return header copy
In is_chunk_start():
        *ret_header = header;
        ...
        kunmap_atomic(header);
ret_header is used after is_chunk_start() returns, when the page is no
longer mapped. A copy of the header should be returned from
is_chunk_start() instead so that it is safe to use.
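The hazard can be illustrated by analogy in Python, which has no kmap/kunmap: a view into a buffer follows later changes to that buffer, while a copy does not. The names and the 8-byte header size below are hypothetical and unrelated to the real chunk header layout.

```python
HEADER_SIZE = 8  # hypothetical header length, for illustration only


def chunk_header_view(page: bytearray) -> memoryview:
    """Unsafe analogue: returns a window into the page itself."""
    return memoryview(page)[:HEADER_SIZE]


def chunk_header_copy(page: bytearray) -> bytes:
    """Safe analogue: returns an independent copy of the header bytes."""
    return bytes(page[:HEADER_SIZE])


def reuse_page(page: bytearray) -> None:
    """Stand-in for the page being unmapped and reused for other data."""
    for i in range(HEADER_SIZE):
        page[i] = 0


page = bytearray(b"LZ4\x03hdr!" + b"payload")
view = chunk_header_view(page)
copy = chunk_header_copy(page)
reuse_page(page)
# 'view' now reads as zeros, while 'copy' still holds the original header.
```

In the kernel the failure mode is an access to an unmapped address rather than silently changed data, so the analogy only covers the "copy outlives the source" part.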
Test-Parameters: testlist=sanity-compr
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Ib5e828d6b61e90dcd70c28589931a4490cf19c22
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55292
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Thu, 30 May 2024 12:58:57 +0000 (20:58 +0800)]
EX-9823 osc: clear oi_write_osclock in lock fini func
Move the clearing of osc_io::oi_write_osclock into osc_lock_fini(), since
it is set in osc_lock_init().
Compression IO could possibly expand the lock region, and
osc_lock_set_writer() could access an osc_io that is not accessed
in osc_io_iter_init(), so that osc_io_rw_iter_fini() misses clearing
the osc_io's oi_write_osclock.
This patch moves the oi_write_osclock clearing into the lock fini function
to match where it is set in osc_lock_init().
Test-Parameters: testlist=sanity-compr env=COMPR_EXTRA_LAYOUT="-E 1M -c 1 -E eof -c 4 -Z lz4:3"
Test-Parameters: testlist=sanity-compr env=COMPR_EXTRA_LAYOUT="-E 1M -c 1 -E eof -Z lz4:3"
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ied42f5befc1abd76aa10a7666eadb9a58e1f1783
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55261
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexandre Ioffe [Tue, 4 Jun 2024 23:25:19 +0000 (16:25 -0700)]
EX-9867 test: unlimit expected number of keepalive msgs
Sometimes test_165g may be internally delayed and
the number of keepalive messages from ofd_access_log_reader
may be unexpectedly large.
To fix this, remove the verified upper bound on the keepalive
message counter and make test_165g expect an unlimited number
of such messages.
Test-Parameters: trivial testlist=sanity env=ONLY="165g",ONLY_REPEAT=20
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I8afcfd3c3e52fda229ef81491259bdc600947bd3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Wed, 5 Jun 2024 13:50:41 +0000 (15:50 +0200)]
LU-17000 gss: update init_channel initialization
Only root needs write access to 'sptlrpc.gss.init_channel', so adjust
permissions accordingly when sysfs file is created.
Lustre-change: https://review.whamcloud.com/55322
Lustre-commit: TBD (from 44c147a3bdf8d44ef3e36c86018bacacec542341)
DDN-Bug-Id: EX-9705
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6539ade1a9d815664f6659a5c1ee25e7f1f7df0e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55320
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Alex Zhuravlev [Tue, 4 Jun 2024 17:59:09 +0000 (20:59 +0300)]
EX-9873 obdclass: reset bits after decompression
as uncompressed data can be smaller than a chunk/page, but still be
visible to userspace as a part of a sparse file.
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4114b0704fb685013f4e03cf2d80ccde2cc8c87f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55308
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Sat, 1 Jun 2024 18:29:57 +0000 (11:29 -0700)]
LU-17773 lov: avoid partly outside array bounds build error
Avoid the "array subscript 'struct lov_stripe_md_entry[0]' is partly
outside array bounds of 'struct lov_stripe_md_entry[0]'" error.
Otherwise an lsme holder will be allocated for invalid lmm magic.
Lustre-change: https://review.whamcloud.com/54944
Lustre-commit: TBD (from 2859950cc91df34ddaf0a45f5f37fa13faf99a5d)
Fixes: 902fe290 ("LU-17261 lov: ignore broken components")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I5a403a0d230d2129e372fd8a22f58901cd0c1b68
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Fri, 31 May 2024 08:54:10 +0000 (11:54 +0300)]
EX-9871 tests: skip sanity-compr 1007 and 1008
if needed tools are not installed
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0dc3d44c300708f3a25bfce06b81993cdd30c418
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Sat, 11 May 2024 01:28:05 +0000 (18:28 -0700)]
LU-17646 llapi: lustreapi: add FID in error messages
Use llapi_fd2fid() to print FID in llapi_lease_set() and
llapi_lease_check() error messages.
Lustre-change: https://review.whamcloud.com/55074
Lustre-commit: 8920e024cbc5d7db094f06e757e07c50524928e6
Test-Parameters: trivial
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Iac97ea721860652e304c674007ac7646d183e2fd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55237
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Thu, 30 May 2024 02:45:31 +0000 (19:45 -0700)]
EX-9280 lipe: extend periodic stats in lpurge
Added periodic stats in lpurge:
- Size and number of files which are not purged due to being
  - stale
  - not mirrored
- Number of inodes, total and used
These stats are refreshed with each purge cycle. For example:
testfs-OST0000: INFO: used_kb: 179564 (3%) total_kb: 5496292
used_inodes: 301 (0%) total_inodes: 375360
testfs-OST0000: INFO: purged: 1 (20480KB 0%) failed_del: 0 (0KB 0%)
stale: 0 (0KB 0%) nomirror: 2 (178176KB 3%)
Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ib404afe2b9d636bf1deaf8948411616971443932
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55248
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Thu, 23 May 2024 02:44:49 +0000 (22:44 -0400)]
LU-17866 pcc: zero ra_pages explictly for a file after PCC mmap
To support mmap under PCC, we do some special magic with mmap to
allow Lustre and PCC to share the page mapping.
The mapping host (@mapping->host) for the Lustre file is replaced
with the PCC copy for mmap. This may result in the wrong setting
of @ra_pages for the Lustre file handle with the backing store of
the PCC copy in the kernel:
->do_dentry_open()->file_ra_state_init():
        file_ra_state_init(struct file_ra_state *ra,
                           struct address_space *mapping)
        {
                ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
                ra->prev_pos = -1;
        }
Setting readahead pages for a file handle is the last step of the
open() call and it is not under the control of the Lustre file
system.
Thus, to avoid setting @ra_pages wrongly, we explicitly set @ra_pages to
zero for the Lustre file handle in all read I/O paths.
When invalidating a PCC copy, we switch back the mapping
between Lustre and PCC. We also set mapping->a_ops back to
@ll_aops.
The readahead path in the PCC backend may enter ->readpage() in
Lustre. We then check whether the file handle is a Lustre file
handle. If not, it must come from the mmap readahead I/O path of the
PCC copy, and we return an error code directly in this case.
Change-Id: Id1e4a9e47bb484e97053759e1743fd2fce040149
Test-Parameters: clientdistro=el8.9 testlist=sanity-pcc env=ONLY=97,ONLY_REPEAT=10
Test-Parameters: clientdistro=el9.3 testlist=sanity-pcc env=ONLY=98,ONLY_REPEAT=10
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 12 Apr 2022 23:18:10 +0000 (17:18 -0600)]
LU-15720 dne: add crush2 hash type
The original "crush" hash type has a significant error with files
that have all-number suffixes, or suffixes that have non-alpha
characters in them. These files will all be placed on the same
MDT as the base filename, which causes MDT imbalance.
Add a "crush2" hash type that has more stringent checks for the
suffix, so that it doesn't consider all-digit suffixes, or files
that only have a '.' at the right offset, as temporary files.
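The difference between a loose and a stricter temporary-suffix check can be sketched as below. This is a guess at the spirit of the change, not the actual hash code: the 6-character suffix length, the alphanumeric rule, and both function names are assumptions made for illustration.

```python
import string

SUFFIX_LEN = 6  # assumed length of the random part of a temp file name
ALNUM = set(string.ascii_letters + string.digits)


def temp_suffix_loose(name: str) -> bool:
    """Loose check: only looks for a '.' at the right offset."""
    return len(name) > SUFFIX_LEN + 1 and name[-(SUFFIX_LEN + 1)] == "."


def temp_suffix_strict(name: str) -> bool:
    """Stricter check: rejects all-digit and non-alphanumeric suffixes."""
    if not temp_suffix_loose(name):
        return False
    suffix = name[-SUFFIX_LEN:]
    if suffix.isdigit():
        return False
    return all(c in ALNUM for c in suffix)
```

Under the loose check a name such as `data.123456` would be treated as a temporary file and hashed by its base name; under the strict check it would be hashed by its full name.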
Test that the "broken" all-digit or extra-'.' filenames are hashed
properly with "crush2". We also need to confirm that the old "crush"
hash has not changed (for name lookup compatibility) and still has
the original "bad hashing" bug that puts all files on the same MDT.
Fix handling of types beyond MDT_HASH_TYPE_CRUSH when creating dirs.
Fix debug layout printing of hash_type in more parts of the code.
Don't flood console if hash type is unrecognized in the future.
Lustre-change: https://review.whamcloud.com/47015
Lustre-commit: 1ac4b9598ad6e2f94c4c672b4733186364255c6a
Lustre-change: https://review.whamcloud.com/48713
Lustre-commit: e17471792388e59f44040d48dd8138ec865663af
Fixes: 0a1cf8da8069 ("LU-11025 dne: introduce new directory hash type 'crush'")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1ce34b8f3af44432f55307ebc6906677c6179d1d
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54925
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 30 May 2024 17:04:27 +0000 (11:04 -0600)]
EX-9708 utils: lfs setstripe adds -E with -Z
When specifying a layout with "lfs setstripe -Z" it will ignore
this option if no PFL component is specified with "-E".
Instead, "lfs setstripe -Z" should automatically upgrade the file
layout to a PFL layout so the compression parameters are saved.
Test-Parameters: trivial testlist=sanity-compr
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I29cc373fabd352d6f8b6781c238806b75cce7057
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Timothy Day [Tue, 9 Jan 2024 17:17:10 +0000 (17:17 +0000)]
LU-17242 debug: use dump_stack() where possible
In some cases, libcfs_debug_dumpstack() can fail to output a
stack trace - either because the needed symbols are not exported
or those symbols can't be resolved at runtime. This seems to
occur more often with newer kernels. The message appears only
as:
Lustre: ldlm_cb01_002: service thread pid 57876 was inactive for
40.494 seconds. The thread might be hung, or it might only be
slow and will resume later. Dumping the stack trace for
debugging purposes:
Pid: 57876, comm: ldlm_cb01_002 6.1.70 #1 SMP PREEMPT_DYNAMIC
Thu Jan 4 18:52:41 UTC 2024
Call Trace TBD:
with no stack trace (seen on CentOS 8.5 with ml 6.1.70).
For reference, the runtime symbol lookup was added and updated in:
b49ce7a ("LU-12400 libcfs: save_stack_trace_tsk if ARCH_STACKWALK")
58ac9d3 ("LU-14099 build: Fix for unconfigured arch_stackwalk")
First, add a message when the symbol can't be resolved correctly.
This makes it much easier to understand why the stack trace is
missing.
Second, replace libcfs_debug_dumpstack(NULL) with dump_stack().
When the task_struct is NULL, libcfs uses the current
task_struct. This replicates the functionality of dump_stack().
Using dump_stack() is more reliable, more in line with kernel
style, and not likely to be un-exported in the future.
Finally, in lustre/osc/osc_object.c the stack isn't dumped since
there is already an LBUG().
There only remains one user of libcfs_debug_dumpstack() which
uses a task_struct other than current. This can be cleaned up
in a future patch.
Lustre-change: https://review.whamcloud.com/53625
Lustre-commit: ecac0c175d934fd5624c9ad8db8f45dbc33fb56c
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I196c1da7e39b1a694c0cb67ecfaab58ab3e4662c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Alexander Zarochentsev [Mon, 29 Apr 2024 17:37:34 +0000 (17:37 +0000)]
LU-17851 ldiskfs: restart long fallocate tx
__ext4_journal_ensure_credits() may allow a long fs operation
like fallocate to run for too long if the initial credit
estimate is high enough.
The fix is to force a tx restart if the tx state is not T_RUNNING.
Lustre-change: https://review.whamcloud.com/55111
Lustre-commit:
f317b5c30e478fdecceea4bd07c85ff305e9d81d
HPE-bug-id: LUS-12311
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ib03d78739997caa6d13690b41ef7d01609a3623b
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55247
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaly Fertman [Tue, 13 Jul 2021 16:07:14 +0000 (19:07 +0300)]
LU-14847 ptlrpc: two replay lock threads
Two replay lock threads conflict with each other, which leads to:
ASSERTION( atomic_read(&imp->imp_replay_inflight) == 1 )
replay_lock_interpret() calls ptlrpc_connect_import() on error, so one
thread starts from the connect reply interpret callback.
replay_lock_interpret() also wakes up ldlm_lock_replay_thread() which
does ptlrpc_import_recovery_state_machine().
It may happen that both threads will get to ldlm_replay_locks() on the
next round at the same time, both increment imp_replay_inflight and
the second one will assert.
The problem appeared in LU-13600 which added ldlm_lock_replay_thread()
with the ptlrpc_import_recovery_state_machine() call.
Lustre-change: https://review.whamcloud.com/44294
Lustre-commit:
d7d7eb50c8f5fd3fc5a7808fb112d233bdef34d7
HPE-bug-id: LUS-10147
Fixes:
3b613a442b ("LU-13600 ptlrpc: limit rate of lock replays")
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: Ia9aafb631e3ba5f850504cc58b4826acec2813bd
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55249
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 19 Dec 2023 08:24:07 +0000 (03:24 -0500)]
LU-9457 test: improve sanity 253
Improve sanity test_253: set high watermark to 50M, and fill OST with
fallocate.
Lustre-change: https://review.whamcloud.com/53548
Lustre-commit:
e934646f5ea87cd8a432db0e672c6ea48867ea47
Test-Parameters: trivial
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I85139d7fc0697d08c21bdb19432b40c8dab82ee9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55276
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Fri, 3 May 2024 00:27:04 +0000 (20:27 -0400)]
LU-15988 osp: don't print nid on -ESTALE
osp_send_update_req() should not access the import upon -ESTALE, because
this MDT may be unmounting.
Lustre-change: https://review.whamcloud.com/55049
Lustre-commit:
ae26dbc3387a17b763cbc901fa256d894a1f88fb
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ibd869e4e8da4f90ffd608a36d866264d5d552d0e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 16 May 2024 19:57:42 +0000 (21:57 +0200)]
LU-15496 tests: fix sanity/398c to use proper OSC name
For ppc64le and aarch64 clients, the OSC import instance name does
not have "ffff" at the start, so use the proper device name for this
subtest.
Clean up the rest of test_398c to meet modern test code style.
Also add debugging to sanity/398c from #53462.
Lustre-change: https://review.whamcloud.com/55132
Lustre-commit:
b1b57bcadeeb5a87ac75387c4aa4ae084e1a27e0
Lustre-change: https://review.whamcloud.com/53462
Lustre-commit:
304ca31e2aa15c576e468a86e45d8817c8eca391
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8c72fa9b13eace009f39daf82454221eba6761b
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55313
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sat, 18 May 2024 19:43:05 +0000 (22:43 +0300)]
LU-15644 llog: don't replace llog error with -ENOTDIR
dt_try_as_dir() contains a check for object existence, and a
missing object ends up being reported as -ENOTDIR. For an llog,
that error propagates to the upper level and is reported on the
console. Neither the error code nor the debug level is
appropriate there.
The patch skips the object existence check for llogs, since it
is excessive anyway.
The debug level is also reduced to avoid console
messages for -ENOENT, -ESTALE or -EIO errors.
Lustre-change: https://review.whamcloud.com/55151
Lustre-commit:
bd9839f7dbdf59751e7cdc234602eb338c518104
Fixes:
1ebc9ed460 ("LU-15902 obdclass: dt_try_as_dir() check dir exists")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id404204566898a6ac2e258b7824491effc5fc92e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55152
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Thu, 30 May 2024 01:17:55 +0000 (18:17 -0700)]
LU-17883 kernel: update SLES15 SP5 [5.14.21-150500.55.65.1]
Update SLES15 SP5 kernel to 5.14.21-150500.55.65.1 for Lustre client.
Lustre-change: https://review.whamcloud.com/55227
Lustre-commit: TBD (from
1372c20c7d85c4d5c216c566647a883af1c5f16a)
Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp5 testlist=sanity
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3
Change-Id: Ie0601c190e52d6192bf389338be51c77db03a9c2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 29 May 2024 00:40:19 +0000 (17:40 -0700)]
LU-17402 kernel: RHEL 8.10 client support
This patch makes changes to support RHEL 8.10 release
with kernel 4.18.0-553.el8_10 for Lustre client.
Lustre-change: https://review.whamcloud.com/54800
Lustre-commit: TBD (from
6748f47fac79e557ae21eb790b597be6449c926a)
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-3
Change-Id: I0a9a262d13e0b0de3607da0982468fd8b5f6a7aa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55207
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 30 May 2024 23:00:40 +0000 (16:00 -0700)]
LU-17404 kernel: update RHEL 9.4 [5.14.0-427.18.1.el9_4]
Update RHEL 9.4 kernel to 5.14.0-427.18.1.el9_4 for Lustre client.
Lustre-change: https://review.whamcloud.com/55203
Lustre-commit: TBD (from
07a23833999207c336532bcf75aa9d5a954f1b07)
Test-Parameters: trivial \
mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3
Change-Id: If18027650ff953733f2e57727b71d2daa61d249c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Elena Gryaznova [Tue, 26 Apr 2022 13:37:27 +0000 (16:37 +0300)]
LU-15785 tests: do not detect versions for RPC_MODE mode
lustre_version_code() is called every time do_rpc_nodes()
is called. Version detection is not needed in RPC_MODE.
Lustre-change: https://review.whamcloud.com/47144
Lustre-commit:
e3fcd81ae5f378ac62754a659c7adf0e0b656cf3
Fixes:
8fa23490bb ("LU-1538 tests: standardize test script init - sanity")
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10914
Change-Id: Ia7645de0a4eedfddf859c80e661ebcb2e45de140
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55272
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 30 May 2024 00:45:45 +0000 (18:45 -0600)]
RM-620 build: New tag 2.14.0-ddn150
New tag 2.14.0-ddn150
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7cac3d582c510f1e19316b97ccfe26dd239dce31
Andreas Dilger [Thu, 30 May 2024 00:45:22 +0000 (18:45 -0600)]
RM-620 build: New tag lipe-2.51
New tag lipe-2.51
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I814564f4535217c614ecc8bbda0ed842661ebf08
Etienne AUJAMES [Mon, 8 Jan 2024 15:06:08 +0000 (16:06 +0100)]
LU-17250 mgs: generate a new MDT configuration by copy
The configuration for a new MDT is generated by reading the client
configuration. The MGS filters existing mdc/osc, interprets the
records and then creates the corresponding osp/osc device for the MDT.
The main idea of this patch is to first convert and copy the records
from the client configuration to create the new MDT, and then
copy the remaining record sections from an existing MDT.
The new MDT can thus inherit OST pools and parameters from the existing
one.
This avoids complex compatibility checks for IPv4/v6 NIDs because
add_uuid records are copied without the need to parse NIDs.
This also allows copying the "add failnid" section from the client.
This patch extends the usage of the "add failnid" section to MDT
configurations.
Here are the steps to copy an existing MDT configuration:
1/ read client configuration and generate osp MDT/OST records for the
new MDT
2/ find an existing MDT configuration
3/ copy and convert the remaining configuration records from the
existing MDT configuration (parameters and OST pools)
Add the regression test conf-sanity 137.
Lustre-change: https://review.whamcloud.com/53614
Lustre-commit:
d4682ff4cc44413810a68e572cf7f05d5b188bb4
Test-Parameters: mdtcount=4 fstype=zfs testlist=conf-sanity
Test-Parameters: mdtcount=4 fstype=ldiskfs testlist=conf-sanity
Test-Parameters: mdtcount=4 fstype=zfs testlist=conf-sanity env=ONLY=137,ONLY_REPEAT=10
Test-Parameters: mdtcount=4 fstype=ldiskfs testlist=conf-sanity env=ONLY=137,ONLY_REPEAT=10
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I4a99085b8930a0dd8002bde87d4e8c575aaccba0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55101
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Patrick Farrell [Fri, 15 Dec 2023 20:48:53 +0000 (15:48 -0500)]
LU-13805 llite: Fix return for non-queued aio
If an AIO fails or is completed synchronously (even
partially), the VFS will handle calling the completion
callback to finish the AIO, and so Lustre needs to return
the number of bytes successfully completed to the VFS.
This fixes a bug where if an AIO was racing with buffered
I/O, the AIO would fall back to buffered I/O, causing it to
complete before returning to the VFS rather than being
queued. In this case, Lustre would return 0 to the VFS, and
the VFS would complete the AIO and report 0 bytes moved.
This fixes the logic for this.
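The return-value rule described above can be sketched in plain C. This is only an illustration under stated assumptions: aio_submit_result() and SKETCH_EIOCBQUEUED are hypothetical names, not the functions touched by the patch, and the constant mirrors the kernel-internal EIOCBQUEUED value (529), which userspace headers do not expose.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Stand-in for the kernel-internal EIOCBQUEUED; hypothetical name. */
#define SKETCH_EIOCBQUEUED 529

/*
 * Decide what an AIO submission path hands back to the VFS.
 * queued:     non-zero if the I/O was really queued for async completion
 * bytes_done: bytes completed synchronously (may be a partial count)
 * err:        0 or a negative errno from the synchronous attempt
 */
static ssize_t aio_submit_result(int queued, ssize_t bytes_done, int err)
{
    assert(bytes_done >= 0 && err <= 0);

    if (queued)
        return -SKETCH_EIOCBQUEUED; /* completion is reported later */
    if (bytes_done > 0)
        return bytes_done;          /* VFS completes the AIO with this count */
    return err;                     /* nothing moved: report the error, or 0 */
}
```

In this sketch, an AIO that fell back to buffered I/O and moved 4096 bytes yields 4096 rather than 0, so the VFS would report the real byte count when it completes the request.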
Lustre-Commit:
8a5bb81f774b9d41f1009b07010372fa9cd03a62
Lustre-Change: https://review.whamcloud.com/49915
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9306402201e2962bbff04a4264c37bd0f1eca7b7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53696
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 27 Apr 2024 02:48:15 +0000 (20:48 -0600)]
LU-17788 ptlrpc: restore watchdog revival message
Restore the "Service thread pid NNN completed after SSS.mmm
seconds. This likely indicates the system was overloaded"
message that was lost during ptlrpc watchdog restructuring.
Do not rate limit this message, so that it is possible to see
when all threads are restored, even if their corresponding
"Service thread pid NNN was inactive" message was throttled.
Update recovery-small test_10a to check for these messages,
so that they are not removed again in the future.
Lustre-change: https://review.whamcloud.com/54942
Lustre-commit:
20c09eff4d397e7158aa4408e0cb50b102cc61c0
Test-Parameters: testlist=recovery-small env=ONLY=10a
Fixes:
fc9de679a4 ("LU-9859 libcfs: add watchdog for ptlrpc service threads.")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c7e96fb7f73ca5562a6f5ad780a79ffc83ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55095
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Vitaliy Kuznetsov [Tue, 21 May 2024 19:05:16 +0000 (21:05 +0200)]
EX-9585 lipe: add lipe_find3 pool option
Add an option to print the OST pool for a file with the
"-printf" argument, both as the long option "%{pool}" and as the
short option "%Lp" that is compatible with "lfs find".
The long %{pools} option prints *all* pools in the layout.
Update the lipe-find3.1 man page and add test cases for both.
Test-Parameters: trivial testlist=sanity-lipe-find3,sanity-lipe-scan3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I18d2d3cc161c8aa92eb27c33b06214b6f53ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54785
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Vitaliy Kuznetsov [Wed, 29 May 2024 15:00:30 +0000 (17:00 +0200)]
EX-9121 lipe: Trivial improvements for report merging
Small changes that do not affect functionality, but allow
some functions to be reused in other parts of lipe3, for example in the
utility for merging different directory stats reports.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ib7eeeccb651e7bcff4ddfc78c66a35793df7bd1d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55232
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Etienne AUJAMES [Thu, 26 Oct 2023 19:28:55 +0000 (21:28 +0200)]
LU-16566 sptlrpc: remove rq_sepol from ptlrpc_request
This patch removes rq_sepol from ptlrpc_request to reduce memory
consumption on the servers.
The rq_sepol field is 327 bytes long, is allocated for each request, and
is rarely used (it needs SELinux activated with the send_sepol
feature).
The patch stores the SELinux policy status string in a separate object.
The pointer is stored in ptlrpc_sec->ps_sepol and protected by RCU
(mostly read-only, the SELinux policy should rarely change).
When the policy status needs to be packed in a request, we take a
reference to the current ps_sepol object and release it after the
packing. If the policy has changed in the meantime, the object used
will be freed afterward.
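The reference-counted lifetime described above can be sketched in userspace C. This is a single-threaded illustration under stated assumptions, not the patch itself: struct sepol_obj, struct sec_sketch and the sepol_*() helpers are hypothetical names, and the RCU protection that makes the pointer read safe against concurrent updates in the kernel is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the separately allocated policy object. */
struct sepol_obj {
    atomic_int refcount;
    char policy[64];
};

/* Hypothetical stand-in for the ps_sepol pointer in struct ptlrpc_sec. */
struct sec_sketch {
    struct sepol_obj *sepol;
};

static int sepol_freed; /* counts frees, for demonstration only */

static struct sepol_obj *sepol_new(const char *policy)
{
    struct sepol_obj *obj = calloc(1, sizeof(*obj));

    assert(obj != NULL);
    atomic_init(&obj->refcount, 1); /* reference held by the owner */
    /* calloc() zeroed the buffer, so the copy stays NUL-terminated */
    strncpy(obj->policy, policy, sizeof(obj->policy) - 1);
    return obj;
}

/* Take a reference before using the object (e.g. while packing). */
static struct sepol_obj *sepol_get(struct sepol_obj *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
    return obj;
}

/* Drop a reference; the last one frees the object. */
static void sepol_put(struct sepol_obj *obj)
{
    if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
        free(obj);
        sepol_freed++;
    }
}

/* Publish a new policy; the old object lives on until its users drop it. */
static void sec_update_sepol(struct sec_sketch *sec, const char *policy)
{
    struct sepol_obj *old = sec->sepol;

    sec->sepol = sepol_new(policy);
    if (old != NULL)
        sepol_put(old);
}
```

A caller that took a reference before a policy change keeps a valid, unchanged string until it calls sepol_put(), at which point the stale object is freed.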
A read operation is added to srpc_sepol parameter to return the
SELinux policy string cached in Lustre.
Lustre-change: https://review.whamcloud.com/52845
Lustre-commit:
3f70481c93dcabbb30267608a0054f4d7092e0db
Test-Parameters: testlist=sanity-selinux env=ONLY=21,ONLY_REPEAT=50
Test-Parameters: testlist=sanity-selinux env=ONLY=21,ONLY_REPEAT=50
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I80fb76c97885c4b2987eb7f91a9bfe6e0e6e6c70
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55211
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 31 Aug 2023 20:50:56 +0000 (14:50 -0600)]
LU-17000 ptlrpc: fix string overflow warnings
Fix potential string overflow warnings in sptlrpc_flavor2name()
calling strncat() with the full size of the target buffer
instead of the *remaining* space in the target buffer.
Fix potential string overflow warning in sepol_seq_write_old()
and sepol_seq_write() potentially copying an unterminated string
from userspace via strncpy() and not terminating it afterward.
Since the maximum incoming parameter size is known in advance,
is reasonably small (~342 bytes), and is only used temporarily,
reorganize the code to avoid two buffer allocations and copies.
Use memcpy() to copy the string since its length is known, and
always add a NUL terminator to the string afterward.
Improve error messages and code style in these functions.
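The two string-handling fixes above can be illustrated with a minimal userspace sketch. The helper names append_bounded() and copy_terminated() are hypothetical and do not appear in the patch; silently truncating an over-long input is also an assumption made here for brevity, where real code may prefer to return an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append src to the NUL-terminated string in buf without writing past
 * bufsize bytes. strncat() takes the maximum number of characters to
 * append (it then adds a NUL), so the bound is the *remaining* space
 * minus one, not the full size of the buffer.
 */
static char *append_bounded(char *buf, size_t bufsize, const char *src)
{
    size_t used;

    assert(buf != NULL && src != NULL && bufsize > 0);
    used = strlen(buf);
    assert(used < bufsize);

    strncat(buf, src, bufsize - used - 1);
    return buf;
}

/*
 * Copy len bytes of a possibly unterminated source into dst and always
 * terminate the result. Returns the number of bytes copied.
 */
static size_t copy_terminated(char *dst, size_t dstsize,
                              const char *src, size_t len)
{
    assert(dst != NULL && src != NULL && dstsize > 0);

    if (len > dstsize - 1)
        len = dstsize - 1;  /* assumption: truncate rather than fail */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

With an 8-byte buffer already holding "ab", append_bounded() adds at most five more characters, leaving room for the terminator.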
Addresses-Coverity: 199034 ("Out-of-bounds access")
Addresses-Coverity: 199063 ("Out-of-bounds access")
Addresses-Coverity: 199108 ("Out-of-bounds access")
Addresses-Coverity: 397374 ("String not null terminated")
Addresses-Coverity: 397394 ("String not null terminated")
Lustre-change: https://review.whamcloud.com/52210
Lustre-commit:
ff62700fa8ee717a71de13baec25f0d69640ae7c
Test-Parameters: trivial testlist=sanity-sec,sanity-selinux
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia810ce9f07b663a90049bb78af21c06f0e3ebbe5
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55210
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Hongchao Zhang [Sat, 20 Apr 2024 06:31:51 +0000 (14:31 +0800)]
LU-17873 test: ignore WIFSIGNALED if rc is 0
Ignore the result of the WIFSIGNALED check if the return status
of the "lctl test_create" thread is zero.
Lustre-change: https://review.whamcloud.com/55194
Lustre-commit: TBD (from
d1000ae89065a6868d0dbbd5c752ff06299d36c4)
Test-Parameters: trivial envdefinitions=SLOW=yes,DEBUG_SIZE=64 mdtcount=1 \
testlist=mds-survey,mds-survey,mds-survey,mds-survey,mds-survey,mds-survey
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ifc3727d48010c9f00f38baff9ff91b5cc3afce5c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55185
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 12 Apr 2024 01:18:28 +0000 (19:18 -0600)]
LU-16915 tests: improve distro type checking
Improve lustre_os_release() infrastructure to reduce redundant
code and make it easier to use.
Lustre-change: https://review.whamcloud.com/54790
Lustre-commit:
1ffbec13c0f745d0b9c6b91959b1afa52f99d63b
Test-Parameters: trivial
Fixes:
339b5e918f ("LU-16915 tests: except sanity-sec test_51")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id02223752df4eb3fd3b62b339e8c417eb33ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 12 Apr 2024 01:18:28 +0000 (19:18 -0600)]
LU-16915 tests: except sanity-sec test_51
Skip sanity-sec test_51 since it has started failing recently with
the move to el9.3 servers.
Add common lustre_os_release infrastructure to make such checking
easier in the future.
Lustre-change: https://review.whamcloud.com/54751
Lustre-commit:
b881bd1051451ed18610e0cc3c3cd56c8803cbc9
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id02223752df4eb3fd3b62b339e8c417eb3e86a12
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Rebanta Mitra [Tue, 28 May 2024 00:17:43 +0000 (17:17 -0700)]
LU-17877 lnet: export REGISTER_FUNC with EXPORT_SYMBOL_GPL
This patch exports REGISTER_FUNC and UNREGISTER_FUNC
with EXPORT_SYMBOL_GPL to load GPL-licensed modules.
Lustre-change: https://review.whamcloud.com/55217
Lustre-commit: TBD (from
b3bdf8ba7fb316905b76decb35bab8dc1947ed91)
Test-Parameters: trivial
Signed-off-by: Rebanta Mitra <rmitra@nvidia.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I3a0d4e2b27911af36e210692d28892590eb0371c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Wed, 15 May 2024 06:30:39 +0000 (23:30 -0700)]
LU-17816 llapi: ensure pool name is nul terminated
strncpy() usage is inconsistent about the size of the pool name
and sometimes forgets to ensure a nul byte is placed at the
end of the copy.
CoverityID: 397181 ("Buffer not null terminated (BUFFER_SIZE)")
Also clean up a case of checking that an unsigned value is >= 0
CoverityID: 397820 ("Unsigned compared against 0 (NO_EFFECT)")
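Both Coverity findings follow common C patterns, sketched below under stated assumptions. SKETCH_POOLNAME_MAX, struct pool_ref, pool_ref_set_name() and index_in_range() are hypothetical names chosen for illustration and are not taken from the llapi sources.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limit; stands in for the real maximum pool name length. */
#define SKETCH_POOLNAME_MAX 15

struct pool_ref {
    char name[SKETCH_POOLNAME_MAX + 1]; /* +1 for the nul byte */
};

/* Copy a pool name, always leaving the buffer nul terminated. */
static void pool_ref_set_name(struct pool_ref *ref, const char *name)
{
    assert(ref != NULL && name != NULL);

    /* strncpy() does not terminate when the source fills the bound ... */
    strncpy(ref->name, name, sizeof(ref->name) - 1);
    /* ... so the last byte is set explicitly on every call. */
    ref->name[sizeof(ref->name) - 1] = '\0';
}

/*
 * An unsigned value is never negative, so "idx >= 0" has no effect.
 * Only the upper bound needs to be tested.
 */
static int index_in_range(unsigned int idx, unsigned int count)
{
    return idx < count;
}
```

Using sizeof(ref->name) in both places keeps the copy bound and the terminator position tied to the same buffer size, which avoids the inconsistency the commit message describes.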
Lustre-change: https://review.whamcloud.com/55018
Lustre-commit:
64469274a4f3e202c76cf9a2757b8f36e8d0ee08
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Idec7adaf89c9dabc0275687c4a069fc8fa63e7a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55119
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Ake Sandgren [Wed, 15 May 2024 05:14:36 +0000 (22:14 -0700)]
LU-16819 build: use mofed path based on target kernel
Instead of using "uname -r", which limits builds to the currently
running kernel, use the target kernel which is available in
LINUXRELEASE, if the directory is available.
Building for a specific kernel is common practice when using DKMS.
Lustre-change: https://review.whamcloud.com/50937
Lustre-commit:
0e9708016b9948676484d290326c1fe8a269eb80
Test-Parameters: trivial
Signed-off-by: Ake Sandgren <ake.sandgren@hpc2n.umu.se>
Change-Id: Ifce912061a74fc5b7435cd940105190f0c3cd544
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55118
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Mon, 20 May 2024 19:59:50 +0000 (12:59 -0700)]
LU-17749 kernel: update RHEL 8.9 [4.18.0-513.24.1.el8_9]
Update RHEL 8.9 kernel to 4.18.0-513.24.1.el8_9 for Lustre client.
Lustre-change: https://review.whamcloud.com/54821
Lustre-commit: TBD (from
23a99efd9104b328ce1edb5fc9094bce2c06e9b9)
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.9 serverdistro=el8.8 testlist=sanity
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.9 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-3
Change-Id: I94b5a95e9e85f2f5e0cddb1dbb519ef92520ad0b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Sat, 11 May 2024 01:16:43 +0000 (18:16 -0700)]
LU-17404 kernel: new kernel [RHEL 9.4 5.14.0-427.16.1.el9_4]
This patch makes changes to support new RHEL 9.4 release
for Lustre client.
Lustre-change: https://review.whamcloud.com/54712
Lustre-commit: TBD (from
177846a0aa58b35d43696b3c3c5d71df0109ab14)
Test-Parameters: trivial \
mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3
Change-Id: Ic292c01ad16dc06e8dee966c4a211896fea284c0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54746
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Cyril Bordage [Wed, 24 Apr 2024 02:21:53 +0000 (04:21 +0200)]
LU-14810 lnet: ongoing push when discovery is stopped
If a push is not completed when the discovery thread is stopped, then we
still have ln_dc_handler used as the MD handler (from
lnet_peer_send_push). That leads to an assertion failure in
lnet_assert_handler_unused.
To fix that, we call lnet_assert_handler_unused only after the monitor
thread has been stopped. Thus, the patch for LU-17496 is not needed
anymore.
Lustre-change: https://review.whamcloud.com/54884
Lustre-commit:
3ba393a5cb21ff0f8bd8a09c341ee01e936321c7
Fixes:
36b14a23a6 ("LU-17207 lnet: race b/w monitor thr stop and discovery push")
Test-Parameters: testlist=sanity-lnet env=ONLY="212 220",ONLY_REPEAT=100
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I426c37b12a3d29327a7295f528a5b875a9ac88a0
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55167
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Fri, 19 Apr 2024 02:53:10 +0000 (22:53 -0400)]
LU-17745 llite: fix the umount panic due to BDI unregister
There is a regression in the patch for LU-16954 on older RHEL
kernels (RHEL 8.2). When Lustre is unmounted, the client
crashes.
In LU-16954, to avoid the remount failure, we explicitly
unregister the sysfs for the @bdi on newer kernels such as the Ubuntu
22.04 v5.15 kernel.
However, this is not needed for older kernels such as RHEL 8.2.
In this patch, we remove the explicit unregister for the old kernel
to avoid the client crash during unmount.
Lustre-change: https://review.whamcloud.com/54850
Lustre-commit:
facff17860ff9a577bad0bf8fb932e869475e011
Fixes:
dcc1dd39a6 ("LU-16954 llite: add SB_I_CGROUPWB on super block for cgroup")
Test-Parameters: clientdistro=ubuntu2204 testlist=sanity-sec
Test-Parameters: clientdistro=el8.9 testlist=sanity-sec
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ic6df572744bed8994c08fb1369cc9beccbe2d87a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55166
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Wed, 15 May 2024 04:56:31 +0000 (21:56 -0700)]
LU-17850 build: prefer LINUXRELEASE over uname -r
In a container or chroot environment, "uname -r" reports
the host instead of the target kernel version. We should
use the LINUXRELEASE variable which is configured in
config/lustre-build-linux.m4 with the value from UTS_RELEASE.
Lustre-change: https://review.whamcloud.com/55108
Lustre-commit: TBD (from
c587c5bdf1c10e4b96e88bb3a0f1972a75dbe9cb)
Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Iaa48027f5ae873e1298695a264db1c351d9eac5c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55116
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mikhail Pershin [Mon, 18 Mar 2024 15:37:02 +0000 (18:37 +0300)]
LU-17649 ptlrpc: fix -EACCES connection error handling
Connection errors -EACCES and -EROFS leave the import in an
intermediate state. It is still active, as is the pinger
over it, but has obd_no_recov set. That allows the import to
recover eventually if server security is updated. But even
in FULL state any RPC over the import gets -ESHUTDOWN because
obd_no_recov is set.
Meanwhile obd_no_recov is not supposed to be used in that
way; it reflects a particular mount option and should never
be recovered. So the patch sets the import to the deactivated state
instead, making the import non-operational as well but with the
option to be activated manually or remounted.
Server connections like LWP, MDT-OST and MDT-MDT are
excluded and are never deactivated. Such errors are
considered temporary until the remote target updates its own
security as required or administrative intervention
restarts the target as needed.
In both cases a console message is issued.
Lustre-change: https://review.whamcloud.com/54448
Lustre-commit:
3f13f89e2f19b46a8f27ad007c10251147984875
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ib83e1b0ac541823ec236591f08145340d6f6bf04
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55224
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Yang Sheng [Mon, 13 May 2024 14:44:16 +0000 (22:44 +0800)]
LU-17847 sec: wake up for rsc entry
We should wake up the waiter after rsc do_upcall.
Otherwise it may be stuck for a long time.
Lustre-change: https://review.whamcloud.com/55094
Lustre-commit:
99b1a2b5df9cffeae68ec88dfe784881109386d8
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I87d1e5a9687056c8ee2428aad45dafda16247de2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55222
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Mon, 8 Apr 2024 09:06:50 +0000 (11:06 +0200)]
LU-17714 gss: cleanup user keyring usage
User keys are linked to the user keyring. But we should not keep an
extra reference on the user keyring for every user key being created.
This leads to too many references on this keyring, and prevents it from
being properly destroyed in case the system wants to clean it up (because
the user logged off for instance).
And when unlinking a user key, we need to take care of the user
namespace, in order to fetch the real user keyring, and not the one
associated with the mapped uid in the user namespace.
Finally we must handle the case where the user key is explicitly
revoked via 'keyctl revoke' on the command line, by carrying out the
same cleanup as when 'lfs flushctx' is called. This properly drops
references on the key, and frees the security context associated with
the key.
Lustre-change: https://review.whamcloud.com/54692
Lustre-commit:
afe0e091d1b82391a929df74717b9665a6f0ab75
Test-Parameters: kerberos=true testlist=sanity-krb5
Fixes:
eef24d8a97 ("LU-17173 gss: user keys go to user keyring")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic168b68f8652689aa4402eaa4fcdbd852743d320
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55170
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Mon, 8 Apr 2024 15:52:50 +0000 (17:52 +0200)]
LU-17714 gss: protect against revoked session keyring
In case the session keyring is revoked, request_key() still tries to
search it. Sadly this keyring is searched before the user keyring, so
it will return -EKEYREVOKED, and the user keyring, that does contain
the Lustre key, will not even be searched.
To work around this issue in the kernel implementation of request_key,
override the current process's credentials with no session keyring,
if we detect it has been revoked.
Lustre-change: https://review.whamcloud.com/54706
Lustre-commit:
045ab5c0273a843493ed2d6d3486b41efe36b834
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I64b6ac4693a47cf43d6fa1bf4e17bfb4907670fa
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55171
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 30 Jan 2024 12:13:52 +0000 (13:13 +0100)]
LU-17483 gss: refresh req context with already existing one
When we are processing a request with a root GSS context that
has the PTLRPC_CTX_ERROR_BIT bit set, try to replace it with an
already existing context. Such a context can already be up-to-date
thanks to other authentication requests sent to failover NIDs while
the current request was in the delay list. This valid context can be
fetched from the struct ptlrpc_sec.
Lustre-change: https://review.whamcloud.com/53859
Lustre-commit:
c76f7288fa772b48cf81050663e2124b25ab3994
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iff1cf727c4579cba6456e010aac6537cf888b0ae
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55169
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 17 May 2024 00:01:49 +0000 (18:01 -0600)]
RM-620 build: New tag 2.14.0-ddn149
New tag 2.14.0-ddn149
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I355196fa930dd63c414bb50c99359b6c2b1ebb32
Stephane Thiell [Thu, 11 Feb 2021 00:15:02 +0000 (16:15 -0800)]
LU-13609 mgs: fix config_log buffer handling
Fix buffer handling in mgs_list_logs() to list all MGS config_logs
using multiple ioctl calls when we have a large number of targets.
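The multi-call listing described above can be sketched as a small userspace model. This is not the mgs_list_logs() code: the function names, the fixed buffer size, and the NUL-terminated packing are assumptions made for illustration only. The point is that each call returns only what fits in the buffer, and the caller resumes from where the previous call stopped.

```python
def list_logs_page(all_logs, start, buf_size):
    """Model one call: pack names (each followed by a NUL byte) into buf_size bytes.

    Returns the names that fit and the index at which the next call should resume.
    """
    page, used, index = [], 0, start
    while index < len(all_logs):
        need = len(all_logs[index].encode()) + 1  # name plus NUL terminator
        if used + need > buf_size:
            break
        page.append(all_logs[index])
        used += need
        index += 1
    return page, index


def list_all_logs(all_logs, buf_size=64):
    """Repeat the call, resuming where the previous one stopped, until done."""
    names, index = [], 0
    while index < len(all_logs):
        page, next_index = list_logs_page(all_logs, index, buf_size)
        if next_index == index:
            # No progress is possible: a single name does not fit in the buffer.
            raise ValueError("buffer too small to hold a single name")
        names.extend(page)
        index = next_index
    return names


if __name__ == "__main__":
    # Hypothetical config log names for a file system with many targets.
    logs = [f"testfs-OST{i:04x}" for i in range(100)]
    print(len(list_all_logs(logs, buf_size=64)))
```

The explicit no-progress check keeps the loop from spinning forever when the buffer cannot hold even one entry.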
Lustre-change: https://review.whamcloud.com/41478
Lustre-commit:
e3f17defc141d8847562b610931255d37ed4dd3c
Fixes:
1d97a8b4cd3d ("LU-13609 llog: list all the log files correctly on MGS/MDT")
Signed-off-by: Stephane Thiell <sthiell@stanford.edu>
Change-Id: I1bf32e918e242f4da83c3d1624b7285a18a88d01
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55102
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Mr NeilBrown [Tue, 21 Mar 2023 23:08:29 +0000 (19:08 -0400)]
LU-10391 mgs: fix lots of white-space irregularities
In preparation for changing the code, fix lots of white-space issues
in mgs_llog.c
Lustre-change: https://review.whamcloud.com/50091
Lustre-commit:
60e6e35f4cad3f79b2e96ddf41a8d8a02d6047ac
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7fb40a473e3e4709778339b773988ec7079d20d8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55100
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Arshad Hussain [Tue, 9 Jan 2024 06:12:57 +0000 (11:42 +0530)]
LU-16861 obdfilter: Exclude quotes when getting NIDs
In get_targets(), when getting NIDs the quotes were also included.
Exclude quotes when generating NIDs as they are not required.
Use $LCTL instead of $lctl, and make it also work in Janitor testing.
Lustre-change: https://review.whamcloud.com/53620
Lustre-commit:
c265e1c7b045bf1f9e5b2919c282b63086929ab6
Test-Parameters: trivial testlist=obdfilter-survey
Fixes:
9ef9906d7 ("LU-6863 tests: change obdfilter-survey.sh for CLIENTONLY mode")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8642539fc6b396f1339e20e4fef8bc78cda2d969
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55090
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Artem Blagodarenko [Thu, 16 May 2024 13:18:34 +0000 (09:18 -0400)]
EX-9784 csdc: Do not print error if a chunk is not compressed
is_chunk_start() can decide that a chunk cannot be decompressed in
two cases: 1) the chunk has not been compressed; 2) the chunk is corrupted.
The error message should be printed only in case 2).
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I85f4850f989ba0fc8f00653f8f6b0f1b4837d625
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55128
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Tue, 26 Oct 2021 08:38:50 +0000 (11:38 +0300)]
LU-15163 osd: osd_obj_map_recover() to restart transaction
osd_obj_map_recover() stops the transaction when it needs to call
vfs_link(), and it has to start a new transaction to modify the
filesystem.
Lustre-commit:
7bf0e557a2b3a463e4d78e81b6ab93987d3dc8af
Lustre-change: https://review.whamcloud.com/45368
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I6efe5444ddc959b19092bebc6e3c7dc25a29cea1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55124
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 14 May 2024 05:30:37 +0000 (07:30 +0200)]
RM-620 build: New tag 2.14.0-ddn148
New tag 2.14.0-ddn148
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I33b2d18f60a7ddbaeb20ac219fd361b13fc12de4
Alexandre Ioffe [Tue, 14 May 2024 00:25:19 +0000 (17:25 -0700)]
EX-9054 lipe: fix incorrect pool pointer usage
Use pointer to pool struct instead of pool name.
Fixes:
504b0b0b61 (EX-9054 lipe: Add SSH stats per agent)
Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I436aaa2ea9eb0059c5cee00882fe4332c6e22fe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55096
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 13 May 2024 22:41:43 +0000 (00:41 +0200)]
RM-620 build: New tag 2.14.0-ddn147
New tag 2.14.0-ddn147
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9f900fdfa9d48220e954ca6053e3d19cb60de9c5
Andreas Dilger [Mon, 13 May 2024 22:40:35 +0000 (00:40 +0200)]
RM-620 build: New tag lipe-2.50
New tag lipe-2.50
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I30e3b022099dac627c25482f4f883450544fceae
Elena Gryaznova [Tue, 11 Jan 2022 17:23:30 +0000 (20:23 +0300)]
LU-15429 tests: mount_mds_client() fix
Client mount/umount must be executed on the active facet/host,
not on mds1_HOST. Without this fix test_140a() fails on a
failover setup:
CMD: lm0101 umount /mnt/lustre2 2>&1
CMD: lm0102 rmdir /mnt/lustre2
lm0102: rmdir: failed to remove '/mnt/lustre2':
No such file or directory
test_140a: FAIL: no clients with recovery disabled
To reproduce the failure just run:
ONLY="107 140a" sh recovery-small.sh
on a failover setup where mds1_HOST != mds1failover_HOST.
Lustre-commit:
1d2e2195873e82a603531e34f3f7d4c634490209
Lustre-change: https://review.whamcloud.com/46043
Fixes:
8bd04b4e57 ("LU-12722 target: disable recovery for local clients")
Test-Parameters: trivial env=ONLY="140a 140b" testlist=recovery-small
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10669
Change-Id: Ifbdedfda840e8421fa8a969f73131ca23982a28b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Wed, 7 Feb 2024 22:12:26 +0000 (14:12 -0800)]
EX-9054 lipe: Add SSH stats per agent
- Add stats counters on SSH per agent: error and
disconnection counters
- Add summary counter on total SSH inactivity
- Disconnect agent when lfs command option request fails
- Report log on INFO level when agent becomes active again
- Fixed minor bugs:
o memory leak when system error happens in
pthread_tryjoin_np()
o Missed stats on job retries
Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I35eebf61d35eb913a167ebd795779188a6217dac
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53957
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Thu, 25 Apr 2024 08:15:49 +0000 (04:15 -0400)]
LU-17756 lod: add tunable lod.*.max_stripes_per_mdt
Add a tunable lod.*.max_stripes_per_mdt for directory overstriping.
The default value is 1 for interoperation.
Add sanity 300uh 300ui.
Lustre-change: https://review.whamcloud.com/54945
Lustre-commit: TBD (from
90d013f8897df887e0eed90593f24751fca97f65)
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id8199f01f5e2d62ead6bf43d239eee8ec1e4cbb5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54947
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Thu, 19 Jan 2023 20:05:38 +0000 (15:05 -0500)]
LU-12273 lod: metadata overstriping
This adds overstriping for MDTs, similar to overstriping
for OSTs (added in LU-9846). This adds a new option to
setdirstripe, -C, allowing creation of more than one stripe
per MDT. It is also possible to place multiple stripes on
the same MDT using specific striping with -m.
This allows a single directory to more fully use the full
capability of each MDT in the file system.
Two limitations of note:
1. This requires > 1 MDT, otherwise the DNE subsystem is
not initialized.
2. Due to recovery limitations, we allow a max of only 5
stripes per MDT.
MDT overstriping increases mdtest-hard-write performance by
up to 13%, mdtest-hard-stat by 93%, at the cost of a slight
drop in mdtest-hard-read (7%), with no change in delete.
4 MDTs, 1 stripe/MDT:
mdtest-hard-write 117.399467 kIOPS : time 339.496 seconds
mdtest-hard-stat 727.020749 kIOPS : time 55.666 seconds
mdtest-hard-read 245.556392 kIOPS : time 162.897 seconds
mdtest-hard-delete 104.379111 kIOPS : time 382.710 seconds
4 MDTs, 4 stripes/MDT:
mdtest-hard-write 132.963290 kIOPS : time 309.093 seconds
mdtest-hard-stat 1408.161148 kIOPS : time 30.107 seconds
mdtest-hard-read 229.383910 kIOPS : time 179.576 seconds
mdtest-hard-delete 103.284369 kIOPS : time 398.442 seconds
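The percentage changes quoted above can be re-derived from the two result tables. A short calculation, using only the kIOPS figures listed in this message (the dictionary and function names are illustrative):

```python
# kIOPS figures copied from the two mdtest runs above.
baseline = {"write": 117.399467, "stat": 727.020749,
            "read": 245.556392, "delete": 104.379111}
overstriped = {"write": 132.963290, "stat": 1408.161148,
               "read": 229.383910, "delete": 103.284369}


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


changes = {op: percent_change(baseline[op], overstriped[op]) for op in baseline}

if __name__ == "__main__":
    for op, pct in changes.items():
        print(f"mdtest-hard-{op}: {pct:+.1f}%")
```

The results are consistent with the summary: write improves by roughly 13%, stat nearly doubles, read drops by just under 7%, and delete changes by about 1%.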
Lustre-change: https://review.whamcloud.com/35034
Lustre-commit:
81ac7c0c989dd862e2215a4635c77e5123289658
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I11556b223029820bd335e87c7bf073970e03468d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53570
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Thu, 9 May 2024 17:34:29 +0000 (20:34 +0300)]
LU-17204 lod: don't panic on short LOVEA
When we request LOVEA and find the existing buffer is not large enough,
we ask for LOVEA's size and reallocate the buffer. But LOVEA can
shrink in parallel (e.g. new default striping), so our expectation
that the size must be greater than the size of the existing buffer is
not correct. Replace the corresponding assertion with a simple
retry plus an extra check for a livelock.
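The retry logic can be modelled in userspace as follows. This is a sketch, not the lod code: read_attr and get_size are hypothetical callbacks standing in for the xattr read and the size query, and the retry limit is an assumed value. The reported size is never asserted to exceed the current buffer, because the attribute may have shrunk between the failed read and the size query.

```python
def read_with_retry(read_attr, get_size, initial_capacity=32, max_retries=10):
    """Read a variable-size attribute that may grow or shrink concurrently.

    read_attr(capacity) returns the value, or None if it does not fit.
    get_size() returns the attribute's current size.
    """
    capacity = initial_capacity
    for _ in range(max_retries):
        data = read_attr(capacity)
        if data is not None:
            return data
        # The buffer was too small at read time. The attribute may have
        # shrunk since then, so the reported size can be <= capacity.
        # Do not assert on it; just make sure the buffer is at least
        # that large and try again.
        capacity = max(capacity, get_size())
    # Extra check for a livelock: give up instead of looping forever.
    raise RuntimeError("attribute size kept changing, giving up")
```

Bounding the number of attempts turns a potential endless loop into an error the caller can handle.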
Lustre-commit:
8fa3532b1ee887be378adbf9432707b2d8a2d814
Lustre-change: https://review.whamcloud.com/52727
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I26ad5091228bf78858f8538478dbcbdb235cddf4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55065
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Wed, 8 May 2024 11:54:03 +0000 (13:54 +0200)]
EX-9121 lipe: Add functionality for parsing ranges from JSON
This patch is the third in a series of patches that implement
functionality for combining size statistics reports and includes
functionality for reading and saving the resulting ranges in tables
obtained from different reports in JSON format.
Only affects file size statistics and this patch is the final one
for reports on file size statistics.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I9714f5ea970b103652f7714c93d76be2549ad3b8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55048
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Wed, 8 May 2024 15:00:41 +0000 (17:00 +0200)]
EX-9121 lipe: Add functionality for parsing tables from JSON
This patch is the second in a series of patches that implement
functionality for combining reports on size statistics and
includes functionality for reading and recording tables obtained
from different reports in JSON format.
Only affects file size statistics.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I742879e61049e00b98d5f9defd7dabaea85fba0a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55047
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Wed, 8 May 2024 13:04:40 +0000 (15:04 +0200)]
EX-9121 lipe: Add entry point for report merging option
This patch adds a new option for lipe_scan3 to merge
statistics reports.
This option will work like this:
lipe_scan3 --merge-reports=/dir_with_reports
File with the results:
Path to out: merged_report.out
Path to yaml: merged_report.yaml
Path to json: merged_report.json
Path to csv: merged_report.csv
This patch is the first in a series of patches to implement the
functionality for merging reports on size statistics and includes
functionality for initialization and the first entry point.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ia02c28811922e0abba52a9c2d6408da8df9ae4c2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Wed, 8 May 2024 13:02:57 +0000 (15:02 +0200)]
EX-9121 lipe: Trivial improvements for report merging
Small changes that do not affect functionality, but allow some
functions to be reused in other parts of lipe3, for example in the
utility for merging different stats reports.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I1bc2d4b22e57a369acea86bf60d8f460c5b3b093
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Sat, 8 Jul 2023 20:35:43 +0000 (16:35 -0400)]
LU-15553 test: mkdir_on_mdt0 in recovery-small.sh
Many subtests in recovery-small.sh require the test dir to be created
on MDT0, so replace mkdir with mkdir_on_mdt0.
Fixes:
b9c4dc3c33 ("LU-14792 llite: enable filesystem-wide default LMV")
Lustre-change: https://review.whamcloud.com/51669
Lustre-commit:
3b0d2821845cf87ae7f03bf41ceae00237d94121
Test-Parameters: trivial
Test-Parameters: testlist=recovery-small,recovery-small,recovery-small
Test-Parameters: mdscount=2 mdtcount=4 testlist=recovery-small,recovery-small,recovery-small
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ibc37b2dd25bcd94794392f5ff8a79df2e7932dcc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55059
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Wed, 23 Nov 2022 17:28:45 +0000 (10:28 -0700)]
LU-16643 lnet: Health logging improvements
LNet health activity can generate noise in console logs. The NI/Peer
NI recovery pings could be expected to fail and the related messages
from lnet_handle_recovery_reply() are generally redundant.
Improve this logging by having the lnet_monitor_thread() provide a
summary of NIs in recovery.
Another useful metric in spotting network trouble is if we have
messages exceeding their deadline. We do not currently log this
information. Keep a count of messages that have exceeded their
deadline and track the total excess time. The lnet_monitor_thread()
will then provide a summary of the number of messages and their
average excess time at a regular interval. These stats are then
reset when the monitor thread prints this information to the console.
Because NIs can be in recovery for extended periods of time, the
interval of console updates will increase from 1 to 5 minutes.
The interval is reset when it is detected that there are no longer any
NIs in recovery and there haven't been any messages past their
deadline since the last console update.
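The bookkeeping described above can be sketched as a small model. This is not the lnet_monitor_thread() code: the class and method names are hypothetical, and jumping straight from the 1-minute to the 5-minute interval after a report is an assumption, since the message does not say how the interval grows.

```python
class DeadlineStats:
    """Counts messages that finished past their deadline between console updates."""

    BASE_INTERVAL = 60   # seconds (1 minute)
    MAX_INTERVAL = 300   # seconds (5 minutes)

    def __init__(self):
        self.count = 0
        self.excess_total = 0.0
        self.interval = self.BASE_INTERVAL

    def record(self, deadline, finished_at):
        """Account for one message; only late messages are counted."""
        if finished_at > deadline:
            self.count += 1
            self.excess_total += finished_at - deadline

    def console_update(self, nis_in_recovery):
        """Return a summary line, or None if there is nothing to report."""
        if self.count == 0 and nis_in_recovery == 0:
            # Nothing in recovery and no late messages: reset the interval.
            self.interval = self.BASE_INTERVAL
            return None
        average = self.excess_total / self.count if self.count else 0.0
        summary = (f"{nis_in_recovery} NIs in recovery; "
                   f"{self.count} messages past deadline, "
                   f"average excess {average:.1f}s")
        # Stats are reset whenever they are printed.
        self.count = 0
        self.excess_total = 0.0
        # Assumption: back off to the longer interval while trouble persists.
        self.interval = self.MAX_INTERVAL
        return summary
```

Keeping only a count and a running total is enough to report the average excess time without storing per-message data.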
Lustre-change: https://review.whamcloud.com/50305
Lustre-commit:
0cb3d86c4004d75810c54bb897ad7fbb6d5ec05f
Test-Parameters: trivial
HPE-bug-id: LUS-11500
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I4ffffd0412806184282178ce0aca3073dd30d7e0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55073
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 25 Apr 2024 16:42:44 +0000 (18:42 +0200)]
LU-17741 gss: fix lsvcgss service for systemd
Add a systemd unit file for lsvcgss service, so that the lsvcgssd
daemon can be handled correctly via systemctl.
Lustre-change: https://review.whamcloud.com/54915
Lustre-commit: TBD (from
ab83ed4cd83370f412e2e151e482bdb3cfef16dd)
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7581996e1e28567415da0827681841ac228ad6c5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55087
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Mon, 13 May 2024 10:03:16 +0000 (12:03 +0200)]
EX-9721 tests: fix sanity-sec test_64x for interop
'server_upcall' rbac value is not known by older servers.
Fixes:
b952bcb620 ("EX-9392 sec: add server_upcall rbac role")
Fixes:
b5e421625b ("EX-9392 sec: use dedicated INTERNAL upcall cache")
Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2 serverversion=EXA6.3.0
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I39a69904ce4709eacf6f08173d3cfe42e247b5bd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55088
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Mon, 13 May 2024 07:48:56 +0000 (10:48 +0300)]
LU-16430 ptlrpc: racy rq_obsolete bit modification
Racy bit modification causes assertion failure in
ptlrpc_at_remove_timed():
ASSERTION( !list_empty(&req->rq_srv.sr_timed_list) )
rq_obsolete is a bit field, so its modification isn't atomic
and it should be modified under rq_lock.
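Why a bit-field update needs the lock can be shown with a small model. Setting one bit is a read-modify-write of the whole word that holds the neighbouring bits, so two unlocked updates can overwrite each other. The flag values and class below are hypothetical and only illustrate the pattern; they are not the ptlrpc structures.

```python
import threading

RQ_OBSOLETE = 0x1  # hypothetical bit positions sharing one word
RQ_OTHER = 0x2


def lost_update_demo():
    """Replay, step by step, two unlocked read-modify-write sequences."""
    word = 0
    a = word          # thread A loads the word
    b = word          # thread B loads the same word
    a |= RQ_OBSOLETE  # A sets its bit in its private copy
    b |= RQ_OTHER     # B sets its bit in its private copy
    word = a          # A stores
    word = b          # B stores and silently drops A's bit
    return word


class Request:
    """Flags are only ever modified while holding the lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.flags = 0

    def set_flag(self, bit):
        with self.lock:
            self.flags |= bit


def run_locked(threads_per_flag=4):
    """Set both flags from several threads; the lock serializes the updates."""
    req = Request()
    threads = [threading.Thread(target=req.set_flag, args=(bit,))
               for bit in (RQ_OBSOLETE, RQ_OTHER)
               for _ in range(threads_per_flag)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return req.flags
```

In the replayed interleaving only the second writer's bit survives, while the locked version always ends with both bits set.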
Lustre-Commit:
14ac768fd9633c5cf4474555170e5042c71a135b
Lustre-Change: https://review.whamcloud.com/49505
Change-Id: Ib1d3ad189a78b71ecf5b01585478922e984c9568
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 8 May 2024 06:05:56 +0000 (01:05 -0500)]
RM-620 build: New tag 2.14.0-ddn146
New tag 2.14.0-ddn146
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1b2d4c0e3121f31b82407beb974a73498edc5862
Qian Yingjin [Mon, 29 Apr 2024 02:49:57 +0000 (22:49 -0400)]
LU-17789 pcc: dont auto PCCRO attach for write/setattr
It is meaningless for a client to do auto PCCRO attach for write
and setattr operations.
Moreover, it may result in sanity-pcc/test_21d failure as
follows:
"FAIL: expected /mnt/lustre/f21d.sanity-pcc: write_mod_data,
got: write_mod_dataa"
This patch fixes it by disabling PCCRO auto attach for write and
setattr operations.
Change-Id: I894db1953a119d12e9337251c069c594fb40482a
Test-Parameters: testlist=sanity-pcc env=ONLY=21d,ONLY_REPEAT=10
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54946
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Fri, 22 Dec 2023 09:16:07 +0000 (04:16 -0500)]
LU-17383 statahead: quit statahead with a long time wait
If the thread is not doing stat for more than a time threshold
(@sbi->ll_sa_timeout, 30 seconds by default) then it probably does
not care too much about performance, or is no longer using this
directory.
In this case, quit the statahead thread after the long wait.
This patch also fixes defects reported by Coverity Scan for
Lustre.
Also add the lines about ll_sa_timeout in
https://review.whamcloud.com/41308
Lustre-change: https://review.whamcloud.com/53535
Lustre-commit:
cfcba1ede861faec33d797e876a0fb11eab4332a
Fixes:
e10bf68d7c3 ("LU-14361 statahead: regularized fname statahead pattern")
Test-Parameters: testlist=parallel-scale-nfsv4
Test-Parameters: testlist=parallel-scale-nfsv4
Test-Parameters: testlist=parallel-scale-nfsv4
Test-Parameters: testlist=parallel-scale-nfsv4
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia7c478268fe12eeefa6dfae1b3c94451f010d1d5
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 12 Mar 2024 14:12:38 +0000 (15:12 +0100)]
EX-9392 sec: use dedicated INTERNAL upcall cache
Implement the INTERNAL upcall cache as a dedicated, separate cache.
This makes it distinct from the regular identity upcall cache that can
be defined to use any upcall including NONE, per an MDT side tuning.
The INTERNAL upcall cache becomes accessible only to clients that
belong to a nodemap for which the 'server_upcall' rbac role is not
enabled.
Dedicated mdt-side tunables are created to configure the entry expiry
time and the acquire expire time for INTERNAL, as well as a tunable to
flush the INTERNAL upcall cache.
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0267182fbfa646de40ac62f832e89fbfd8477822
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54361
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sergey Cheremencev [Sat, 20 Jan 2024 06:38:38 +0000 (14:38 +0800)]
LU-14535 quota: free lvbo in a wq
Mutex lqe_glbl_data_lock held in
qmt_lvbo_free might be the reason for
"sleeping while atomic" if
cfs_hash_for_each_relax is holding a
spinlock on an upper layer:
BUG: sleeping function called from invalid
context at kernel/mutex.c:104
...
Call Trace:
dump_stack+0x19/0x1b
__might_sleep+0xd9/0x100
mutex_lock+0x20/0x40
qmt_lvbo_free+0xc7/0x380 [lquota]
mdt_lvbo_free+0x12d/0x140 [mdt]
ldlm_resource_putref+0x189/0x250 [ptlrpc]
ldlm_lock_put+0x1c8/0x760 [ptlrpc]
ldlm_export_lock_put+0x12/0x20 [ptlrpc]
cfs_hash_for_each_relax+0x3ff/0x450 [libcfs]
cfs_hash_for_each_empty+0x9a/0x210 [libcfs]
ldlm_export_cancel_locks+0xc2/0x1a0 [ptlrpc]
ldlm_bl_thread_main+0x7c8/0xb00 [ptlrpc]
kthread+0xe4/0xf0
ret_from_fork_nospec_begin+0x7/0x21
Move freeing of lvbo to a workqueue. This
patch can probably be reverted as soon
as https://review.whamcloud.com/45882
lands.
Lustre-change: https://review.whamcloud.com/54107
Lustre-commit:
2cc18ece1e50c760786a13a9dcb5857d7768cb0f
Fixes:
1dbcbd70f8 ("LU-15021 quota: protect lqe_glbl_data in lqe")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I56aee72a7adbc6514b40689bae30669e607b5ecd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mikhail Pershin [Mon, 22 Jan 2024 12:58:23 +0000 (15:58 +0300)]
LU-17379 mgc: try MGS nodes faster
Re-organize import_select_connection to try all NIDs
faster, at least in the first round.
- check NID LNet discovery status and skip those not
discovered yet in the first round; in later rounds just
select the least recently used one
- reset AT timeout to minimal values in the first round
- track per-connection total attempts to connect,
how many were replied to, and discovery status, and output
this in import stats
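The selection policy in the list above can be sketched as a small model. This is not import_select_connection(): the Conn type and its fields are hypothetical, and falling back to all connections when none has been discovered yet is an assumption that the message does not state.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    nid: str
    discovered: bool          # LNet discovery status of the NID
    last_attempt: float = 0.0  # time of the last connect attempt


def select_connection(conns, first_round):
    """Pick the next connection to try.

    First round: skip NIDs that are not discovered yet.
    Later rounds: consider every NID.
    In both cases the least recently used candidate wins.
    """
    candidates = conns
    if first_round:
        discovered = [c for c in conns if c.discovered]
        # Assumption: if nothing is discovered yet, fall back to all NIDs
        # rather than returning nothing.
        if discovered:
            candidates = discovered
    return min(candidates, key=lambda c: c.last_attempt)


if __name__ == "__main__":
    conns = [Conn("10.0.0.1@tcp", False, 0.0),
             Conn("10.0.0.2@tcp", True, 5.0),
             Conn("10.0.0.3@tcp", True, 2.0)]
    print(select_connection(conns, first_round=True).nid)
    print(select_connection(conns, first_round=False).nid)
```

Skipping undiscovered NIDs only in the first round avoids waiting on peers that are likely unreachable, while later rounds still give every NID a chance.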
Lustre-change: https://review.whamcloud.com/54022
Lustre-commit:
94d05d0737db256a64626bfe6fa9801819230d8a
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ib4d043e82bf156cc3e7c9ddeff0055790edcc9ee
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Mon, 5 Feb 2024 20:14:30 +0000 (12:14 -0800)]
LU-17379 lnet: add LNetPeerDiscovered to LNet API
LNetPeerDiscovered is added to allow Lustre to check
whether the peer has been successfully discovered by LNet
before attempting to open a connection to it.
For example, given a mount command with a list of NIDs,
Lustre can use LNetAddPeer API to initiate discovery on
every candidate first, and later use LNetPeerDiscovered
to select a reachable peer to connect to.
Lustre-change: https://review.whamcloud.com/53926
Lustre-commit:
dba41355565397228f587f13a901b5d762521ed0
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I7c9964148a5a2a24d7889b8b4c2e488a433ca258
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54950
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>