Whamcloud - gitweb
fs/lustre-release.git
6 months ago LU-16585 build: remove python2 dependencies 76/52176/2
Alex Deiter [Tue, 21 Feb 2023 22:27:47 +0000 (02:27 +0400)]
LU-16585 build: remove python2 dependencies

Fixed packaging issue caused by scripts and control files.

Lustre-change: https://review.whamcloud.com/50084
Lustre-commit: bea3f81f84fd16d2d403682ef25b8abe314acd0f

Test-Parameters: trivial
Signed-off-by: Alex Deiter <alex.deiter@gmail.com>
Change-Id: I6c9b24bf811269928494af17c15627902e5fe27b
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52176
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 months ago LU-16943 tests: use primary ost1 server in replay-single/135 59/52059/3
Jian Yu [Thu, 24 Aug 2023 01:01:20 +0000 (18:01 -0700)]
LU-16943 tests: use primary ost1 server in replay-single/135

This patch fixes replay-single test_135() to make sure
the primary ost1 server is used at the beginning of the test.

Lustre-change: https://review.whamcloud.com/52058
Lustre-commit: cdd8b056bff0d48155eaf4b7732d1d8880ceda55

Test-Parameters: trivial testlist=replay-single

Test-Parameters: trivial env=FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=replay-single,mmp

Fixes: 81418be83ed8 ("LU-16943 tests: fix replay-single/135 under hard failure mode")
Change-Id: Ia25314255c9f00ba71687e1f757517f37031caed
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52059
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 months ago LU-16626 build: remove python2 dependencies 26/51426/3
Alex Deiter [Thu, 9 Mar 2023 14:09:19 +0000 (18:09 +0400)]
LU-16626 build: remove python2 dependencies

Fixed packaging issue caused by zfsobj2fid script.

Lustre-change: https://review.whamcloud.com/50241
Lustre-commit: 404a1e827b0a9d86864695c8699e1ca076be6c9d

Test-Parameters: trivial
Signed-off-by: Alex Deiter <alex.deiter@gmail.com>
Change-Id: I4375038b0d2c2b42ac4080fe834d35bdd3ef54f8
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51426
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 months ago LU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP 68/51668/4
Jian Yu [Fri, 14 Jul 2023 05:22:18 +0000 (13:22 +0800)]
LU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP

After RHEL 9.x or Ubuntu 22.04 client is evicted, "lfs df" returns
error code 95 (EOPNOTSUPP), which is ignored in check_lfs_df_ret_val()
and then causes client_evicted() to ignore that error.

This patch fixes client_evicted() to check the return value
from "lfs df" directly so as not to ignore EOPNOTSUPP.
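
The difference between the two checks can be sketched as follows. This is only an illustration in Python: the real client_evicted() and check_lfs_df_ret_val() are shell functions in the test framework whose exact conditions are not shown here, so `lenient_check`, `evicted_via_helper` and `evicted_direct` are hypothetical names.

```python
EOPNOTSUPP = 95  # error code quoted in the commit message


def lenient_check(rc, ignored=(EOPNOTSUPP,)):
    """Hypothetical stand-in for a helper that treats some codes as success."""
    return 0 if rc in ignored else rc


def evicted_via_helper(rc):
    # Old pattern: the error is filtered out before the eviction test sees it.
    return lenient_check(rc) != 0


def evicted_direct(rc):
    # New pattern: look at the raw return value of "lfs df".
    return rc != 0


print(evicted_via_helper(EOPNOTSUPP))  # False: the eviction goes unnoticed
print(evicted_direct(EOPNOTSUPP))      # True
```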

Lustre-change: https://review.whamcloud.com/51667
Lustre-commit: a5a9ded43b72238c2df8e0a74f03151ea3d4ce99

Test-Parameters: trivial clientdistro=el9.2 testlist=replay-vbr
Test-Parameters: trivial clientdistro=el8.8 testlist=replay-vbr

Change-Id: I633ae8769fc563b8068f433e2afae29463ac5553
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51668
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
6 months ago LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12 11/49611/3
Lei Feng [Thu, 4 Nov 2021 11:41:06 +0000 (19:41 +0800)]
LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12

In some rare cases 12 quota ids are needed.
Usually (user, group) * (block, inode) * (inode, parent) = 8 qids
are needed. But with project id,
(user, group, project) * (block, inode) * (inode, parent) = 12 qids
are needed.
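
The arithmetic above can be checked by enumerating the combinations. The labels are taken from the message; the enumeration itself is only an illustrative sketch and is not how the quota code counts ids.

```python
from itertools import product

ID_TYPES_WITHOUT_PROJECT = ("user", "group")
ID_TYPES_WITH_PROJECT = ("user", "group", "project")
RESOURCES = ("block", "inode")
OBJECTS = ("inode", "parent")


def count_qids(id_types):
    """Count the (id type, resource, object) combinations."""
    return len(list(product(id_types, RESOURCES, OBJECTS)))


print(count_qids(ID_TYPES_WITHOUT_PROJECT))  # 8
print(count_qids(ID_TYPES_WITH_PROJECT))     # 12
```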

Lustre-change: https://review.whamcloud.com/45456
Lustre-commit: 61481796ac85e9ab2469b8d2f4cc75088c65d298

Change-Id: I4b3ee197f6e274abda06edf60b246f089fe28d10
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-quota
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49611
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
6 months ago LU-16517 build: pass extra configure options to "make debs" 78/51178/4
Jian Yu [Wed, 31 May 2023 06:40:09 +0000 (23:40 -0700)]
LU-16517 build: pass extra configure options to "make debs"

While running "make debs", the configure command in debian/rules
ignores some user defined configure options. This patch fixes
the issue by adding the detection of the extra options into
debian/rules.

Lustre-change: https://review.whamcloud.com/50464
Lustre-commit: 3989529f22f5c54a98e445674b4b3cc443a3af5f

Test-Parameters: trivial clientdistro=ubuntu2004

Change-Id: Ia9db4e05abf33834cb3c853f4f0829dadc8d7400
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51178
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
6 months ago LU-16943 tests: fix replay-single/135 under hard failure mode 08/51608/4
Jian Yu [Fri, 14 Jul 2023 06:04:42 +0000 (14:04 +0800)]
LU-16943 tests: fix replay-single/135 under hard failure mode

This patch fixes replay-single test_135() to load libcfs module
on the failover partner node to avoid 'fail_val' setting error.
It also fixes the issue that not all of the OSTs are mounted after
failing back ost1.

Lustre-change: https://review.whamcloud.com/51574
Lustre-commit: 74140e5df4c094f7f0e923e1b82c464b18e8a7cc

Test-Parameters: trivial testlist=replay-single
Test-Parameters: trivial fstype=zfs testlist=replay-single

Test-Parameters: trivial env=FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=replay-single

Change-Id: Id46c722a6db9d832829a739f41f7462b32a6d9d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51608
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 months ago LU-15740 tests: scale fs_log_size by OSTCOUNT 06/51606/2
Andreas Dilger [Fri, 24 Mar 2023 23:09:44 +0000 (17:09 -0600)]
LU-15740 tests: scale fs_log_size by OSTCOUNT

The fs_log_size "free space skew" was being scaled by MDSCOUNT,
but in fact this parameter is only ever used to compare the OST
free space usage, so the OSTCOUNT should be used when scaling it.

It is likely that the skew is actually caused by blocks allocated
by OST object directories and not llogs (no llogs used on OSTs for
many years), but it isn't worthwhile to rename the function.

Lustre-change: https://review.whamcloud.com/50419
Lustre-commit: fabec6f2cb39950a2f208567dac716e21880fa9f

Test-Parameters: trivial testlist=runtests
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I97f05b10fa7ec367534b5bdce09feae5e93ebbe5
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51606
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 months ago LU-16934 kernel: update RHEL 8.8 [4.18.0-477.15.1.el8_8] 18/51518/3
Jian Yu [Sat, 29 Jul 2023 03:36:03 +0000 (20:36 -0700)]
LU-16934 kernel: update RHEL 8.8 [4.18.0-477.15.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.15.1.el8_8.

Lustre-change: https://review.whamcloud.com/51517
Lustre-commit: 830bf7a1f8de73a4f46248e6b8d2bbcd944a1f09

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I66365dce63065a0a07958a182a3c705e9948d424
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51518
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 months ago LU-16060 osd-ldiskfs: copy nul byte terminator in writelink 56/51356/3
Alexander Zarochentsev [Wed, 20 Jul 2022 16:05:53 +0000 (19:05 +0300)]
LU-16060 osd-ldiskfs: copy nul byte terminator in writelink

The memcpy() call in osd_ldiskfs_writelink() doesn't copy the nul
terminator byte from the source buffer, leaving the space
after the target link name uninitialized, which is ok for the kernel
code and debugfs but not for e2fsck.
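
The effect of copying one byte too few can be sketched in Python with a byte buffer. This is a simulation only: `write_link` is a hypothetical helper, and the 0xAA fill stands in for whatever stale bytes the on-disk buffer held before the copy.

```python
def write_link(block_size, target, copy_terminator):
    # 0xAA stands in for stale, uninitialized buffer contents.
    block = bytearray(b"\xaa" * block_size)
    src = target + b"\0"  # source buffer including its nul terminator
    n = len(target) + (1 if copy_terminator else 0)
    block[:n] = src[:n]   # equivalent of memcpy(block, src, n)
    return bytes(block)


without_nul = write_link(16, b"a/b", copy_terminator=False)
with_nul = write_link(16, b"a/b", copy_terminator=True)
print(without_nul[:5])  # b'a/b\xaa\xaa'
print(with_nul[:5])     # b'a/b\x00\xaa'
```

A reader that relies on the stored length is unaffected, while a checker that expects a terminated string sees stale bytes after the name in the first case.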

HPE-bug-id: LUS-11103

Lustre-change: https://review.whamcloud.com/48092
Lustre-commit: 907dc0a2d333f2df2d654a968fc50f8cc05b779d

Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I914f2c78e1a6571bf360a23b0ede8c70502bf0df
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51356
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 months ago LU-15519 quota: fallocate does not increase projectid usage 35/51535/3
Arshad Hussain [Mon, 14 Feb 2022 08:36:47 +0000 (14:06 +0530)]
LU-15519 quota: fallocate does not increase projectid usage

fallocate() was not accounting for projectid quota usage.
This was happening for two reasons: 1) the projectid
was not properly passed to md_op_data in ll_set_project()
and 2) the OBD_MD_FLPROJID flag was not set to receive the
projectid.

This patch addresses the above reasons.

Test-case: sanity-quota/78a added

Lustre-change: https://review.whamcloud.com/46676
Lustre-commit: 5fc934ebbbe665f24e2f11fe224065dd8e9a08ba

Fixes: 48457868a02a ("LU-3606 fallocate: Implement fallocate preallocate operation")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I3ed44e7ef7ca8fe49a08133449c33b62b1eff500
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 months ago LU-16873 osd: update OI_Scrub file with new magic 25/51525/2
Alexander Zarochentsev [Sun, 28 May 2023 12:42:27 +0000 (08:42 -0400)]
LU-16873 osd: update OI_Scrub file with new magic

The fix for LUS-11542 detects the format change correctly
but does not write new oi scrub file magic, so new mount
triggers the "oi files counter reset" again and again.

Lustre-change: https://review.whamcloud.com/51226
Lustre-commit: 38b7c408212f60d684c9b114d90b4514e0044ffe

Fixes: 126275ba83 ("LU-16655 scrub: upgrade scrub_file from 2.12 format")
HPE-bug-id: LUS-11646
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ia13fcfaf0d8f2c4ee9331dd9fec0ff159d195186
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51525
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15800 ofd: take a read lock for fallocate 02/51702/2
Alex Zhuravlev [Tue, 10 May 2022 07:48:55 +0000 (10:48 +0300)]
LU-15800 ofd: take a read lock for fallocate

There is no need to take a write (exclusive) lock on the object
for fallocate - we just need to serialize fallocate
vs destroy; all internal structures should be protected
by the OSD and the disk filesystem, as in the write path.

Lustre-change: https://review.whamcloud.com/47268
Lustre-commit: 5fae80066162ea637c8649f6439fc14e1d9a7cf8

Fixes: cdaaa87f6b ("LU-14214 ofd: fix locking in ofd_object_fallocate()")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I65986745865ee329c5257a7efca5e79403830608
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51702
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11787 test: Fix checkfilemap tests for 64K page 87/51287/2
James Simmons [Mon, 31 Jan 2022 17:44:46 +0000 (12:44 -0500)]
LU-11787 test: Fix checkfilemap tests for 64K page

File mapping is page size aligned. Modify the tests to handle 64K
page.
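
As a rough illustration of why a hard-coded 4K expectation breaks on 64K pages, the sketch below rounds a length up to the page size. `round_up_to_page` is a hypothetical helper and does not reproduce the test's actual calculation; in Python the running system's page size is available as `mmap.PAGESIZE`.

```python
def round_up_to_page(length, page_size):
    """Round length up to the next multiple of page_size."""
    return (length + page_size - 1) // page_size * page_size


print(round_up_to_page(1000, 4096))    # 4096
print(round_up_to_page(1000, 65536))   # 65536
print(round_up_to_page(65537, 65536))  # 131072
```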

Lustre-change: https://review.whamcloud.com/45629
Lustre-commit: 7c88dfd28b5cc6114a85f187ecb2473657d42c9d

Test-Parameters: trivial clientdistro=el8.7 clientarch=aarch64 testlist=sanityn env=ONLY="71a 71b"
Change-Id: I316a197db8cdd0f9064431f8c572b43adf6110b8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51287
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14668 tests: verify state of peer added with '--lock_prim' 35/51135/3
Serguei Smirnov [Thu, 9 Mar 2023 23:00:46 +0000 (15:00 -0800)]
LU-14668 tests: verify state of peer added with '--lock_prim'

Add peer state verification to sanity-lnet test_26:
check that peer state has corresponding bit set for a peer
created with '--lock_prim' option.

Lustre-change: https://review.whamcloud.com/50249
Lustre-commit: 9b6fcfa334b153e52caec16d4cfd180306826a3a

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 05f7f6a0b ("LU-14668 lnet: add 'force' option to lnetctl peer del")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Id5fde036907f9dd19a21e8e6611a070321310f0e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14668 lnet: add 'lock_prim_nid' lnet module parameter 34/51134/4
Serguei Smirnov [Tue, 28 Feb 2023 23:02:20 +0000 (15:02 -0800)]
LU-14668 lnet: add 'lock_prim_nid' lnet module parameter

Add 'lock_prim_nid' lnet module parameter to allow control
of how Lustre peer primary NID is selected.
If set to 1 (default), the NID specified by Lustre when
calling LNet API is designated as primary for the peer,
allowing for non-blocking discovery in the background.
If set to 0, peer discovery is blocking until complete
and the NID listed first in discovery response is designated
as primary.
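
The two settings can be summarized with a small sketch. It is a simplified model of the behavior described above, not LNet code; `select_primary_nid` and the example NIDs are hypothetical.

```python
def select_primary_nid(lock_prim_nid, lustre_nid, discovered_nids):
    """Return (primary NID, whether the caller must wait for discovery)."""
    if lock_prim_nid:
        # The NID passed in by Lustre is primary; discovery can run in the background.
        return lustre_nid, False
    # Discovery has to complete first; the first NID in the response is primary.
    return discovered_nids[0], True


discovered = ["10.0.0.2@o2ib", "10.0.0.2@tcp"]
print(select_primary_nid(1, "10.0.0.2@tcp", discovered))  # ('10.0.0.2@tcp', False)
print(select_primary_nid(0, "10.0.0.2@tcp", discovered))  # ('10.0.0.2@o2ib', True)
```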

Lustre-change: https://review.whamcloud.com/50159
Lustre-commit: fc7a0d6013b46ebc17cdfdccc04a5d1d92c6af24

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I6ed1cb0c637f4aa7a7340a6f01819ba9a85858f4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51134
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14668 lnet: add 'force' option to lnetctl peer del 33/51133/3
Serguei Smirnov [Mon, 27 Feb 2023 23:41:19 +0000 (15:41 -0800)]
LU-14668 lnet: add 'force' option to lnetctl peer del

Add --force option to 'lnetctl peer del' command.
If the peer has primary NID locked, this option allows
for the peer to be deleted manually:
lnetctl peer del --prim_nid <nid> --force

Add --prim_lock option to 'lnetctl peer add' command.
If specified, the primary NID of the peer is locked
such that it is going to be the NID used to identify
the peer in communications with Lustre layer.

Lustre-change: https://review.whamcloud.com/50149
Lustre-commit: f1b2d8d60c593a670b36006bcf9b040549d8c13a

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia6001856cfbce7b0c3288cff9b244b569d259647
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51133
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14668 lnet: don't delete peer created by Lustre 32/51132/3
Amir Shehata [Thu, 6 May 2021 06:02:22 +0000 (23:02 -0700)]
LU-14668 lnet: don't delete peer created by Lustre

Peers created by Lustre have their primary NIDs locked.
If that peer is deleted, it'll confuse Lustre. So when manually
deleting a peer using:
   lnetctl peer del --prim_nid ...
We must continue to preserve the primary NID. Therefore we delete
all the constituent NIDs, but keep the primary NID. We then
flag the peer for rediscovery.
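
A minimal model of that deletion rule, using a plain dictionary for the peer. The field names and `delete_peer` are hypothetical, and full removal of an unlocked peer is an assumption made for the sketch.

```python
def delete_peer(peer):
    """Return the peer state after a manual delete, or None if it is removed."""
    if peer["primary_locked"]:
        # Keep only the primary NID and ask for the peer to be rediscovered.
        return {**peer, "nids": [peer["primary_nid"]], "rediscover": True}
    # Assumption: a peer without a locked primary NID is removed entirely.
    return None


lustre_peer = {
    "primary_nid": "10.0.0.2@tcp",
    "nids": ["10.0.0.2@tcp", "10.0.0.2@o2ib"],
    "primary_locked": True,
}
print(delete_peer(lustre_peer)["nids"])  # ['10.0.0.2@tcp']
```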

Lustre-change: https://review.whamcloud.com/43565
Lustre-commit: 7cc5b4329fc2eecbf09dbda85efe58f4ad5a32b9

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I34eef9b0049435a01fde87dc8263dd50f631c551
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51132
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14668 lnet: Peers added via kernel API should be permanent 31/51131/3
Chris Horn [Tue, 25 May 2021 16:17:49 +0000 (11:17 -0500)]
LU-14668 lnet: Peers added via kernel API should be permanent

The LNetAddPeer() API allows Lustre to predefine the Peer for LNet.
Originally these peers would be temporary and potentially re-created
via discovery. Instead, let's make these peers permanent. This allows
Lustre to dictate the primary NID of the peer. LNet makes sure this
primary NID is not changed afterwards.

Lustre-change: https://review.whamcloud.com/43788
Lustre-commit: 41733dadd8ad0e87e44dd19e25e576e90484cb9b

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I3f54c04719c9e0374176682af08183f0c93ef737
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14668 lnet: Lock primary NID logic 30/51130/2
Amir Shehata [Wed, 5 May 2021 18:35:06 +0000 (11:35 -0700)]
LU-14668 lnet: Lock primary NID logic

If a peer is created by Lustre make sure to lock that peer's
primary NID. This peer can be discovered in the background.
There is no need to block until discovery is complete, as Lustre
can continue on with the primary NID it provided.

Discovery will populate the peer with other interfaces the peer has
but will not change the peer's primary NID. It can also delete
peer's NIDs which Lustre told it about (not the Primary NID).

If a peer has been manually discovered via
   lnetctl discover <nid>
command, then make sure to delete the manually discovered
peer and recreate it with the Lustre NID information
provided for us.

Lustre-change: https://review.whamcloud.com/50106
Lustre-commit: aacb16191a72bc6db1155030849efb0d6971a572

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8fc8a69caccca047e3085bb33d026a3f09fb359b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16717 mdt: resume dir migration with bad_type 43/51243/2
Lai Siyao [Fri, 28 Apr 2023 09:22:03 +0000 (05:22 -0400)]
LU-16717 mdt: resume dir migration with bad_type

LFSCK may set hash type to "none,bad_type" upon migration failure,
set it back to "fnv_1a_64,migrating,bad_type,fixed" to allow
migration resumption. fnv_1a_64 is set because it's the default hash
type, and now that we don't know the hash type in the original
migration command, just try with it.

LFSCK just adds the "bad_type" flag on such a directory, so that such a
migration can always be resumed in the future.

Add sanity 230z.

Lustre-change: https://review.whamcloud.com/50797
Lustre-commit: 151650e468ab423e831c30d635ea380e0434a122

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I19606aefcb9115e6724843785aea89a1c380e23f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51243
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16052 llog: handle -EBADR for catalog processing 72/48772/5
Mikhail Pershin [Mon, 17 Oct 2022 23:29:52 +0000 (16:29 -0700)]
LU-16052 llog: handle -EBADR for catalog processing

Llog catalog processing might retry to get the last llog block
to check for new records if any. That might return -EBADR code
which should be considered as valid. Previously -EIO was
returned in all cases.

Run conf-sanity test_106 several times as specific test

Lustre-change: https://review.whamcloud.com/48070
Lustre-commit: e260f751f2a21fa126eeb4bc9e94250ba3e815f1

Test-Parameters: testlist=conf-sanity env=ONLY=106,SLOW=yes,ONLY_REPEAT=10
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I30e04ba2c91c8bdce72c95675a1209639e9f0570
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48772
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-6612 utils: strengthen llog_reader vs wrong format/header 00/48900/4
Bruno Faccini [Wed, 22 Feb 2023 19:21:06 +0000 (11:21 -0800)]
LU-6612 utils: strengthen llog_reader vs wrong format/header

The following snippet shows that llog_reader can be puzzled due to
an invalid 0 for the number of records when parsing an expected
LLOG file header :
root# dd if=/dev/zero bs=4096 count=1 of=/tmp/zeroes
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000263962 s, 15.5 MB/s
root# llog_reader /tmp/zeroes
Memory Alloc for recs_buf error.
Could not pack buffer; rc=-12

Lustre-change: https://review.whamcloud.com/15654
Lustre-commit: 45291b8c06eebf33d3654db3a7d3cfc5836004a6

Test-Parameters: trivial testlist=sanity,sanity-hsm
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I12be79e6c6a5da384a5fd81878a76a7ea8aa5834
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48900
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
9 months ago LU-15481 llog: Add LLOG_SKIP_PLAIN to skip llog plain 71/48771/5
Etienne AUJAMES [Wed, 22 Feb 2023 19:18:49 +0000 (11:18 -0800)]
LU-15481 llog: Add LLOG_SKIP_PLAIN to skip llog plain

Add the catalog callback return LLOG_SKIP_PLAIN to conditionally skip
an entire llog plain.

This could speed up catalog processing for specific usages when a
record needs to be accessed in the "middle" of the catalog. This could
be useful for changelogs with several users or for HSM.

This patch modifies chlg_read_cat_process_cb() to use LLOG_SKIP_PLAIN.
The main idea came from: d813c75d ("LU-14688 mdt: changelog purge
deletes plain llog")
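
The idea can be sketched with a toy catalog of 40 plain logs. This is a simplified Python model, not the llog code: the value of `LLOG_SKIP_PLAIN`, the callback signature, and the helper names are placeholders chosen for the illustration.

```python
LLOG_SKIP_PLAIN = "skip-plain"  # placeholder; the real constant's value is not shown here


def make_catalog(total_records, plains):
    """Split record indices 1..total_records into contiguous plain logs."""
    per_plain = -(-total_records // plains)  # ceiling division
    return [range(start, min(start + per_plain, total_records + 1))
            for start in range(1, total_records + 1, per_plain)]


def catalog_cb(plain, start_index):
    """Skip a whole plain log when its last record precedes the wanted index."""
    return LLOG_SKIP_PLAIN if plain[-1] < start_index else 0


def process_catalog(catalog, start_index, use_skip):
    wanted, visited = [], 0
    for plain in catalog:
        if use_skip and catalog_cb(plain, start_index) == LLOG_SKIP_PLAIN:
            continue
        for index in plain:
            visited += 1
            if index >= start_index:
                wanted.append(index)
    return wanted, visited


catalog = make_catalog(400, 40)
print(process_catalog(catalog, 399, use_skip=True))   # ([399, 400], 10)
print(process_catalog(catalog, 399, use_skip=False))  # ([399, 400], 400)
```

Both runs return the same records; only the number of records visited differs.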

**Performance test:**

* Environment:
2474195 changelog records stored on the mds0 (40 llog plain):
mds# lctl get_param -n mdd.lustrefs-MDT0000.changelog_users
current index: 2474195
ID    index (idle seconds)
cl1   0 (3509)

* Test
Access to records at the end of the catalog (offset: 2474194):
client# time lfs changelog lustrefs-MDT0000 2474194 >/dev/null

* Results
- with the patch:  real    0m0.592s
- without the patch: real    0m17.835s (x30)

Lustre-change: https://review.whamcloud.com/46310
Lustre-commit: aa22a6826ee521ab14994a4533b0dbffb529aab0

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I887d5bef1f3a6a31c46bc58959e0f508266c53d2
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48771
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
9 months ago LU-16717 mdt: treat unknown hash type as sane type 35/51235/2
Lai Siyao [Sun, 23 Apr 2023 08:09:02 +0000 (04:09 -0400)]
LU-16717 mdt: treat unknown hash type as sane type

Directory migration failure may leave directory hash type as
LMV_HASH_TYPE_UNKNOWN|LMV_HASH_FLAG_BAD_TYPE, which should be treated
as sane hash type on existing directories, otherwise such directories
can't be unlinked.

Add sanity 230y.

Lustre-change: https://review.whamcloud.com/50796
Lustre-commit: 05cdb71ba6813570123613993f3cfcf74fc83561

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ieffc0808d1db989d0bf9723f05cddb06f349e208
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50796
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51235
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14294 tests: fixed NFS configuration issue 83/51283/4
Alex Deiter [Mon, 7 Nov 2022 17:47:21 +0000 (21:47 +0400)]
LU-14294 tests: fixed NFS configuration issue

* Used the systemctl command to manage system services
* Used the same order of parameters to setup and cleanup NFS
* Used tab for indentation

Lustre-change: https://review.whamcloud.com/49062
Lustre-commit: 1a8fe55b17ac2bc2195aaba446467ccdac67b564

Test-Parameters: trivial clientdistro=el7.9 \
testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Test-Parameters: clientdistro=el8.7 \
testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Test-Parameters: clientdistro=el9.0 \
testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Test-Parameters: clientdistro=sles12sp5 \
testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Test-Parameters: clientdistro=sles15sp4 \
testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Test-Parameters: clientdistro=ubuntu2004 \
testlist=parallel-scale-nfsv3,parallel-scale-nfsv4

Change-Id: I6b087035ac7524aa99c0facad48f8c3fb7444cbc
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51283
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16163 tests: skip racer_on_nfs for NFSv3 82/51282/3
Alex Deiter [Fri, 7 Apr 2023 19:49:23 +0000 (23:49 +0400)]
LU-16163 tests: skip racer_on_nfs for NFSv3

Export ALWAYS_EXCEPT env for child NFS test

Lustre-change: https://review.whamcloud.com/50579
Lustre-commit: 892d726f274c7cd4e505689ad69194ac68dc323b

Fixes: 513eb670b0 ("LU-16163 tests: skip racer_on_nfs for NFSv3")
Test-Parameters: trivial testlist=parallel-scale-nfsv3
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: Ibb4a9916166f13ab9bd2374b33d4313453972276
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11388 tests: replay-single/131b to refresh grants 89/51289/2
Alex Zhuravlev [Mon, 17 Apr 2023 18:13:59 +0000 (21:13 +0300)]
LU-11388 tests: replay-single/131b to refresh grants

so that the write (to be replayed after replay-barrier)
doesn't turn sync due to insufficient grant.

Lustre-change: https://review.whamcloud.com/50661
Lustre-commit: 384e1e858eef826677bfa6913074a83c4fab37d3

Test-Parameters: trivial testlist=replay-single env=ONLY=131b,ONLY_REPEAT=30
Fixes: cb3b2bb683 ("LU-11388 test: enable replay-single test_131b")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If4656c1028b49c58eedd905abd0c329f3706f491
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11785 tests: fix conf-sanity/98 mount check on 64K page 88/51288/2
Kevin Zhao [Fri, 28 Oct 2022 02:05:24 +0000 (10:05 +0800)]
LU-11785 tests: fix conf-sanity/98 mount check on 64K page

This patch fixes the mount option length check expectation
failure on 64K pages. The maxopt_len is the minimum
of page_size and 64K, but the test case
hard-codes the option length for the 4K case. This
patch adds the mount options according to the page size.

Lustre-change: https://review.whamcloud.com/48177
Lustre-commit: 4068ca725954db2a1fc42bf8d184f4672c2ed113

Test-Parameters: trivial testlist=conf-sanity env=ONLY=98
Test-Parameters: testlist=conf-sanity env=ONLY=98 clientarch=aarch64 clientdistro=el8.7
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Change-Id: Icdeb8b73308056e216c3f4ce71907b0c928d2c30
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-13081 tests: skip sanity test_151/test_156 86/51286/2
Alex Deiter [Wed, 26 Apr 2023 22:04:01 +0000 (02:04 +0400)]
LU-13081 tests: skip sanity test_151/test_156

Skip both sanity test_151 and test_156 during interop testing,
since this is really testing server-side functionality only
(OSS caching behavior). And it makes sense to just exclude
test_151 and test_156 during interop testing, otherwise it
seems that the client version of the test can become
inconsistent with the caching behavior/tunables on the OSS
and the failures don't mean anything. There is enough
non-interop testing to catch any regressions in the OSS
cache behavior.

Lustre-change: https://review.whamcloud.com/50777
Lustre-commit: 305dda878d1dde822eab7a9dacfe8dec0b96cb3e

Test-Parameters: trivial
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I39a8b54894d5b0c7573e6c56d1f8e1ba02b3e3fe
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51286
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months agoLU-15123 tests: check quota reintegration after recovery 33/51233/2
Alex Zhuravlev [Wed, 19 Apr 2023 07:20:33 +0000 (10:20 +0300)]
LU-15123 tests: check quota reintegration after recovery

The 4th step of quota reintegration (reconciliation) waits for recovery
completion, so tests (like sanity-quota/7a) should wait for recovery
completion before checking reintegration results.

Lustre-change: https://review.whamcloud.com/50688
Lustre-commit: 4432b6e2824775e292f96e202d6fc0db231bc749

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id0aa5db01658621103d94ad6dafe91b2960b3a33
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51233
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months agoLU-14377 tests: make parallel-scale/rr_alloc less strict 42/51142/4
Andreas Dilger [Wed, 19 Oct 2022 00:37:58 +0000 (18:37 -0600)]
LU-14377 tests: make parallel-scale/rr_alloc less strict

test_rr_alloc() sometimes fails with a difference of 3-4 objects
per OST, after creating 1500+ objects on each OST.  This should
not be considered fatal.  Make the test more lenient, and allow
a difference of up to 0.3% of objects between the OSTs.
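
A minimal sketch of such a tolerance check, assuming the 0.3% is taken
relative to the largest per-OST object count (the commit message does not say
which count is the base). The function name within_tolerance is hypothetical,
and the real test is a shell script, not Python.

```python
def within_tolerance(per_ost_counts, per_mille=3):
    """Return True if max-min spread is at most per_mille/1000 of the max.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    spread = max(per_ost_counts) - min(per_ost_counts)
    return spread * 1000 <= per_mille * max(per_ost_counts)

if __name__ == "__main__":
    print(within_tolerance([1500, 1503, 1504]))
    print(within_tolerance([1500, 1510]))
```

With about 1500 objects per OST, this allows a spread of up to 4 objects,
which covers the 3-4 object differences mentioned above.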

Fix some code style issues in the test.

Lustre-change: https://review.whamcloud.com/48914
Lustre-commit: b104c0a27713899a4d047f56fed57c30c39b8195

Test-Parameters: trivial testlist=parallel-scale env=ONLY=rr_alloc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib6ba8c5d8e9d3245833448a52f8ed25308698a33
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
(cherry picked from commit b104c0a27713899a4d047f56fed57c30c39b8195)
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51142
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months agoLU-15821 ldlm: Prioritize blocking callbacks 10/49610/2
Patrick Farrell [Thu, 5 May 2022 00:50:57 +0000 (20:50 -0400)]
LU-15821 ldlm: Prioritize blocking callbacks

The current code places bl_ast lock callbacks at the end of
the global BL callback queue.  This is bad because it
causes urgent requests from the server to wait behind
non-urgent cleanup tasks to keep lru_size at the right
level.

This can lead to evictions if there is a large queue of
items in the global queue so the callback is not serviced
in a timely manner.

Put bl_ast callbacks on the priority queue so they do not
wait behind the background traffic.
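
The idea can be sketched as a two-level queue in which urgent items are always
served first. This is an illustration only: the class and method names are
hypothetical and do not correspond to the ldlm code.

```python
from collections import deque

class CallbackQueues:
    """Toy model: a priority queue served before a regular queue."""

    def __init__(self):
        self.priority = deque()
        self.regular = deque()

    def add(self, item, urgent=False):
        # Urgent items (e.g. blocking callbacks) skip the background work.
        (self.priority if urgent else self.regular).append(item)

    def next(self):
        # Drain the priority queue before touching regular items.
        if self.priority:
            return self.priority.popleft()
        if self.regular:
            return self.regular.popleft()
        return None
```

An urgent item added after many background items is still returned by the
next call to next(), so its latency does not depend on the backlog length.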

Add some additional debug in this area.

Lustre-change: https://review.whamcloud.com/47215
Lustre-commit: 2d59294d52b696125acc464e5910c893d9aef237

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic6eb65819a4a93e9d30e807d386ca18380b30c7d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49610
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months agoPrepare for next pointrelease.
Oleg Drokin [Tue, 18 Jul 2023 14:54:54 +0000 (10:54 -0400)]
Prepare for next pointrelease.

Change-Id: Idc2b25dea2b4c0f587f735dc1bdf2dd358d1f647

10 months agoNew release 2.15.3 2.15.3 v2_15_3
Oleg Drokin [Mon, 19 Jun 2023 23:58:09 +0000 (19:58 -0400)]
New release 2.15.3

Change-Id: Iad4584f4bc2345339ce9dd507483e47e9d59daba
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 months agoNew RC 2.15.3-RC1 2.15.3-RC1 v2_15_3-RC1
Oleg Drokin [Sat, 27 May 2023 01:30:09 +0000 (21:30 -0400)]
New RC 2.15.3-RC1

Change-Id: I85e435b6168982fd89f7f155cadfd1493db57c5a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16756 kernel: update RHEL 9.2 [5.14.0-284.11.1.el9_2] 08/50908/4
Jian Yu [Tue, 23 May 2023 07:33:14 +0000 (00:33 -0700)]
LU-16756 kernel: update RHEL 9.2 [5.14.0-284.11.1.el9_2]

Update RHEL 9.2 kernel to 5.14.0-284.11.1.el9_2 for Lustre client.

Lustre-change: https://review.whamcloud.com/50907
Lustre-commit: TBD (from 5d6850eea510acfb74d08e86afebc253d29ca7fd)

Test-Parameters: trivial env=SANITY_EXCEPT="27J 101j" clientdistro=el9.2 testlist=sanity

Change-Id: I408132158c9824b601e07fde645ceb235c3e7a49
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50908
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16782 kernel: update RHEL 9.1 [5.14.0-162.23.1.el9_1] 88/50788/5
Jian Yu [Tue, 23 May 2023 07:30:28 +0000 (00:30 -0700)]
LU-16782 kernel: update RHEL 9.1 [5.14.0-162.23.1.el9_1]

Update RHEL 9.1 kernel to 5.14.0-162.23.1.el9_1 for Lustre client.

Lustre-change: https://review.whamcloud.com/50785
Lustre-commit: 82a19dc322f998f2429f14706839db58449f1ae4

Test-Parameters: trivial clientdistro=el9.1 \
env=SANITY_EXCEPT=101j testlist=sanity

Test-Parameters: trivial serverdistro=el8.7 clientdistro=el9.1 \
env=SANITY_EXCEPT=101j testlist=sanity

Change-Id: I961bac2129b98da2950694fa03e0bf47b780d85c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50788
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16755 kernel: update RHEL 8.8 [4.18.0-477.10.1.el8_8] 52/51052/3
Jian Yu [Tue, 23 May 2023 07:26:18 +0000 (00:26 -0700)]
LU-16755 kernel: update RHEL 8.8 [4.18.0-477.10.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.10.1.el8_8.

Lustre-change: https://review.whamcloud.com/51051
Lustre-commit: TBD (from 5c47d1454a24dc6a5d330575546f5388652414ce)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I6d7703512f9c5a8b686f06e94f32f0e51c9b2001
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51052
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-12353 ldiskfs: add ext4-dquot-commit-speedup patch to more series 83/50983/4
Jian Yu [Fri, 12 May 2023 19:00:15 +0000 (12:00 -0700)]
LU-12353 ldiskfs: add ext4-dquot-commit-speedup patch to more series

Add ext4-dquot-commit-speedup.patch to RHEL 8.x ldiskfs patch series.

Lustre-change: https://review.whamcloud.com/50853
Lustre-commit: TBD (from 06da805983c298f0957decfdb1d08cf7c39fd99b)

Test-Parameters: trivial clientdistro=el8.7 serverdistro=el8.7 testlist=sanity
Test-Parameters: trivial clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: Ib0ac325bde442b4eafedde9ba44984b02d5ea061
Fixes: dad25f258e50 ("LU-12353 ldiskfs: speedup quota journalling")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50983
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 months agoLU-16649 llite: EIO is possible on a race with page reclaim 00/50600/3
Patrick Farrell [Tue, 9 May 2023 18:48:03 +0000 (14:48 -0400)]
LU-16649 llite: EIO is possible on a race with page reclaim

We must clear the 'uptodate' page flag when we delete a
page from Lustre, or stale reads can occur.  However,
generic_file_buffered_read requires any pages returned from
readpage() be uptodate.

So, we must retry reading if page truncation happens in
parallel with the read.
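
A generic sketch of the retry pattern, not the llite implementation: the
callable read_once and the bound on attempts are assumptions made for this
example.

```python
import errno

def read_page_with_retry(read_once, max_attempts=8):
    """Call read_once() until it returns an up-to-date page.

    read_once is a hypothetical callable returning (data, uptodate).
    The attempt bound is an assumption of this sketch.
    """
    for _ in range(max_attempts):
        data, uptodate = read_once()
        if uptodate:
            return data
        # The page was dropped in parallel with the read; try again.
    raise OSError(errno.EIO, "page never became up to date")
```

Without the retry, the first non-uptodate page would surface to the caller as
EIO even though a second read would have succeeded.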

This implements the same fix as:
https://review.whamcloud.com/49647
b4da788a819f82d35b685d6ee7f02809c05ca005

did for the mmap path.

Lustre-change: https://review.whamcloud.com/50344
Lustre-commit: 1d98e5c32b41e19bb1247958e666bb66e69dbc4c

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iae0d1eb343f25a0176135347e54c309056c2613a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50600
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoRevert "LU-14541 llite: Check vmpage in releasepage" 99/50599/5
Patrick Farrell [Tue, 9 May 2023 15:13:38 +0000 (11:13 -0400)]
Revert "LU-14541 llite: Check vmpage in releasepage"

This reverts commit c524079f4f59a39b99467d9868ee4aafdcf033e9,
because it breaks releasepage for Lustre and does not
completely fix the data consistency issue in LU-14541.

Breaking releasepage matters because it prevents direct I/O
from working if there is page cache data present, and
because it causes similar issues with GDS, which must be
able to flush page cache pages before doing I/O.

With patches:
"LU-16160 llite: SIGBUS is possible on a race with page reclaim"/
d9c23a7934747eb19e23470b30806482a1aa60f8
and
"LU-14541 llite: Check for page deletion after fault"/
19678e30147f50f813e72e8216cfb0453fe0ca6e
LU-14541 is fully resolved, so we can revert this patch.

Lustre-change: https://review.whamcloud.com/49654
Lustre-commit: e3cfb688ed7116a57b2c7f89a3e4f28291a0b69f

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I613bdb4f27161ffc3638d1d8ea38827af5a7bd47
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50599
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-14541 llite: Check for page deletion after fault 98/50598/4
Patrick Farrell [Tue, 9 May 2023 15:06:47 +0000 (11:06 -0400)]
LU-14541 llite: Check for page deletion after fault

Before completing a page fault and returning to the kernel,
we lock the page and verify it has not been truncated.  But
we must also verify the page has not been deleted from
Lustre, or we can return a disconnected (ie, not tracked by
Lustre) page to the kernel.

We mark deleted pages !uptodate, but this doesn't matter
for faulted pages, because the kernel assumes they are
returned uptodate, and maps them in to the process address
space.  Once mapped, the page state is not checked until
the page is unmapped.

But because the page is referenced by the mapping, it stays
in the page cache even though it's been disconnected from
Lustre.

Because the page is disconnected from Lustre, it will not
be found and cancelled on lock cancellation.  This can
result in stale data reads.

This is particularly an issue with releasepage (called from
drop_caches or under memory pressure), which can delete
pages separately from cancelling covering locks.

If releasepage is disabled, which is effectively what
"LU-14541 llite: Check vmpage in releasepage"
does, this is not an issue.  But disabling releasepage
causes other problems and is incorrect anyway.

Lustre-change: https://review.whamcloud.com/49653
Lustre-commit: b3d2114e538cf95a7e036f8313e9095fe821da79

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If1164db8f8e92a1cf811431d56d15f30d8eb3faa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50598
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16165 sec: retry mechanism for identity cache 46/49746/2
Sebastien Buisson [Fri, 16 Sep 2022 16:02:51 +0000 (18:02 +0200)]
LU-16165 sec: retry mechanism for identity cache

Implement a retry mechanism in the identity cache in case the
identity upcall times out.

Lustre-change: https://review.whamcloud.com/48579
Lustre-commit: 61c3b3a9bb848e256845462ffd79b15565cd23ad

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib70d3b851a6da3cf66dfed49b03be51da7886d01
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49746
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16091 enc: S_ENCRYPTED flag on OST objects for enc files 57/50657/3
Sebastien Buisson [Thu, 11 Aug 2022 15:08:11 +0000 (17:08 +0200)]
LU-16091 enc: S_ENCRYPTED flag on OST objects for enc files

Add a dumb encryption context on OST objects being created, when the
LUSTRE_ENCRYPT_FL flag gets set in the LMA, for ldiskfs backend
targets. This leads ldiskfs to internally set the LDISKFS_ENCRYPT_FL
flag on the on-disk inode. Also, it makes e2fsprogs happy to see an
enc ctx for an inode that has the LDISKFS_ENCRYPT_FL flag.

Add a dumb encryption context on OST objects being opened, if there is
not already one, for ldiskfs backend targets. This is done by adding
the LUSTRE_ENCRYPT_FL flag if necessary, at the same time as atime
gets updated. It is some sort of live self-check that fixes OST
objects created with an older Lustre version.

Enhance lfsck to detect and fix OST objects belonging to encrypted
files that are missing the encryption flag. This is implemented in the
MDT-OST consistency routine, as part of the layout checking.

Also add sanity-sec test_62 and sanity-lfsck test_42 to exercise this.

Note this patch does not add any dumb encryption context on OST
objects when the backend is ZFS.

Lustre-change: https://review.whamcloud.com/48198
Lustre-commit: 348446d6370b3f63f0da8a96997b3295f896c6fb

Test-Parameters: testlist=sanity-sec mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 fstype=zfs
Test-Parameters: testlist=sanity-sec mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 fstype=zfs
Test-Parameters: testlist=sanity-sec mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 fstype=zfs
Test-Parameters: testlist=sanity-sec mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 fstype=zfs
Test-Parameters: testlist=sanity-sec mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 fstype=zfs
Test-Parameters: testlist=sanity-sec mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 fstype=zfs
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6bee3c82ee4d1a52275facf9e2b0d60061e0beef
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50657
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-15374 tests: check FULL and IDLE for client import state 29/50629/3
Jian Yu [Thu, 13 Apr 2023 18:06:48 +0000 (11:06 -0700)]
LU-15374 tests: check FULL and IDLE for client import state

The client-to-OST import state can be FULL or IDLE.

Lustre-change: https://review.whamcloud.com/49298
Lustre-commit: 25fb82eb413389b6023e0e61f7efb71e91d15c01

Test-Parameters: trivial testgroup=review-dne-part-3

Test-Parameters: trivial env=SLOW=no,FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=recovery-mds-scale

Fixes: 25606a2ce1 ("LU-15342 tests: escape "|"")
Fixes: 3da8f014fd ("LU-12857 tests: allow clients to be IDLE after recovery")
Fixes: 5a6ceb664f ("LU-7236 ptlrpc: idle connections can disconnect")

Change-Id: I3582ceb273d241ee71fe907f6d1423746e453faa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50629
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
11 months agoLU-16795 build: Update ZFS version to 2.1.11 63/50863/3
Jian Yu [Thu, 4 May 2023 20:07:10 +0000 (13:07 -0700)]
LU-16795 build: Update ZFS version to 2.1.11

Update ZFS version to 2.1.11. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.1.11

Lustre-change: https://review.whamcloud.com/50856
Lustre-commit: f827e1e04e36f3f8533c321c1248c7b0a62da287

Change-Id: I51de752fd82174586bcda7a9a42152b9fb2111bd
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50863
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
11 months agoLU-15626 tests: Fix "error" reported by shellcheck for recovery-mds-scale 28/50628/2
Arshad Hussain [Thu, 13 Apr 2023 18:01:20 +0000 (11:01 -0700)]
LU-15626 tests: Fix "error" reported by shellcheck for recovery-mds-scale

This patch fixes "error" issues reported by shellcheck
for file lustre/tests/recovery-mds-scale.sh. This patch
also converts spaces to tabs.

Lustre-change: https://review.whamcloud.com/46865
Lustre-commit: a98728e4fd673ebe7a7d1d3f15a5a06d1efec9e3

Test-Parameters: trivial clientcount=6 mdtcount=2 mdscount=2 osscount=2 austeroptions=-R failover=true iscsi=1 env=FAILOVER_PERIOD=180,DURATION=600,SLOW=no testlist=recovery-mds-scale
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I6c098809835950e1f781e04a6898895592407948
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-15412 tests: Let init_clients_lists() export client vars 27/50627/2
Xinliang Liu [Thu, 13 Apr 2023 17:57:34 +0000 (10:57 -0700)]
LU-15412 tests: Let init_clients_lists() export client vars

init_clients_lists() computes the client-related variables
correctly, so let it define and export these variables.

This patch fixes sanity test 807 getting stuck when running on a
multi-node Lustre cluster with CLIENTS empty.

Also clean up client count checking. Now CLIENTS is always set.

Lustre-change: https://review.whamcloud.com/45994
Lustre-commit: f8e56a25cfc3e1f5af52e616bac950a5fb90ea40

Change-Id: I9a5d4b9bde401e14e1d7f6f88b04c8d1c6aea11a
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-9859 libcfs: add "default" keyword for debug mask 72/49872/2
Andreas Dilger [Fri, 21 Jan 2022 07:20:56 +0000 (00:20 -0700)]
LU-9859 libcfs: add "default" keyword for debug mask

Allow "lctl set_param debug=default" to reset the debug mask to
the default value.  This is useful if the debug needs to be set
to a higher value temporarily, but should be easily reset back
to the original value afterward.

Lustre-change: https://review.whamcloud.com/46251
Lustre-commit: 4c9a5762413638cc630b1facfb565dcd765fce1e

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7d0d8fb81e51afb5ea6f29abea0d0814de3ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49872
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16369 ldiskfs: do not check enc context at lookup 95/49895/2
Sebastien Buisson [Tue, 6 Dec 2022 16:36:02 +0000 (17:36 +0100)]
LU-16369 ldiskfs: do not check enc context at lookup

On rhel8, ldiskfs should not check for encryption context of inodes
upon lookup. On these kernels, ext4 is not encryption aware, so just
assume context is fine when target is mounted as ldiskfs.

Lustre-change: https://review.whamcloud.com/49324
Lustre-commit: 540c293a4d0fc80253670b3db8d6722da43284ad

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9f9813d290ea24b34f710e2c8219e856ca8fbc58
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-16025 llite: adjust read count as file got truncated 89/50689/2
Bobi Jam [Thu, 7 Jul 2022 07:38:54 +0000 (15:38 +0800)]
LU-16025 llite: adjust read count as file got truncated

A file read does not notice when the file is truncated by another
node, and continues to read zero-filled pages beyond the new file size.

This patch adds a bound to the read to prevent the issue and
adds a test case verifying the fix.
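
The bound can be pictured as clamping the requested byte count to the current
file size. This is only a sketch of the arithmetic; the function name
clamp_read_count is hypothetical and the real change lives in the llite read
path.

```python
def clamp_read_count(pos: int, count: int, file_size: int) -> int:
    """Return how many bytes may be read at pos without passing file_size."""
    if pos >= file_size:
        # The read starts at or beyond the (possibly truncated) end of file.
        return 0
    return min(count, file_size - pos)
```

For example, a 4096-byte read at offset 0 of a file truncated to 100 bytes is
limited to 100 bytes instead of returning zero-filled data past the new size.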

Lustre-change: https://review.whamcloud.com/47896
Lustre-commit: 4468f6c9d92448cb72c5a616ec74653e83ee8e10

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie51ba09201a1ca1464c3a3892d367590e978ee34
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50689
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16286 ldiskfs: reimplement nodelalloc optimization 21/50821/2
Andrew Perepechko [Mon, 1 May 2023 17:02:54 +0000 (10:02 -0700)]
LU-16286 ldiskfs: reimplement nodelalloc optimization

fiemap calls perform a costly delayed extent search that affects
BRW performance. However, Lustre does not use delayed
allocation at all, so skip this search completely, as we did
in RHEL7.

Lustre-change: https://review.whamcloud.com/49007
Lustre-commit: 3dd73b5c5d61a219c702873711055cb1cc80394a

LU-16286 ldiskfs: add ext4_find_delayed_extent patch to more series

Add rhel8.4/ext4-optimize-find_delayed_extent.patch to RHEL 8.7
and RHEL 8.8 ldiskfs patch series.

Test-Parameters: trivial clientdistro=el8.6 serverdistro=el8.6 testlist=sanity
Test-Parameters: trivial clientdistro=el8.7 serverdistro=el8.7 testlist=sanity
Test-Parameters: trivial clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I2c3562cf5cbdf3c5532e4b79b28a040a995322b7
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-11161
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50821
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16659 build: Detect the mofed path based on running kernel 18/50518/4
Gaurang Tapase [Tue, 4 Apr 2023 05:45:35 +0000 (11:15 +0530)]
LU-16659 build: Detect the mofed path based on running kernel

Lustre-change: https://review.whamcloud.com/50517
Lustre-commit: 99172e355fbb37d2ba671d08ce0370ac0e8ae971
Test-Parameters: trivial

Change-Id: I519e93e8c26807da6143e2cf4d825ccf4a4180e4
Signed-off-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50518
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-16756 kernel: new kernel [RHEL 9.2 5.14.0-283.el9] 52/50752/6
Jian Yu [Wed, 26 Apr 2023 01:29:20 +0000 (18:29 -0700)]
LU-16756 kernel: new kernel [RHEL 9.2 5.14.0-283.el9]

This patch makes changes to support new RHEL 9.2 release
for Lustre client.

Lustre-change: https://review.whamcloud.com/50745
Lustre-commit: dd390cd315f505456e8e8a29e16a0f24b6878e27
Test-Parameters: trivial env=SANITY_EXCEPT="27J 101j" clientdistro=el9.2 testlist=sanity

Change-Id: I4886bbf30d6d6a93c4adbfb68871e9d91f5b64de
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50752
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
12 months agoLU-16650 kernel: update RHEL 7.9 [3.10.0-1160.88.1.el7] 54/50554/4
Jian Yu [Thu, 6 Apr 2023 06:47:15 +0000 (23:47 -0700)]
LU-16650 kernel: update RHEL 7.9 [3.10.0-1160.88.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.88.1.el7.

Lustre-change: https://review.whamcloud.com/50553
Lustre-commit: bd0d79456b91db58a75eeb717c7805d78d8a9a1a

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I4119595943940cca94d1853b59c94a02fed8cb71
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50554
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-16373 tests: failover mds1 back to the primary server 18/49418/3
Jian Yu [Thu, 15 Dec 2022 19:31:50 +0000 (11:31 -0800)]
LU-16373 tests: failover mds1 back to the primary server

This patch fixes recovery-small test 144a to failover
mds1 back to the primary server so that stack_trap can
set timeout parameter on the correct mds node.

Lustre-change: https://review.whamcloud.com/49345
Lustre-commit: d6411c87a98be0a7e8b7460bf537c6502b6daeca
Test-Parameters: trivial \
env=SLOW=yes,FAILURE_MODE=HARD,ONLY=144a \
clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
austeroptions=-R failover=true iscsi=1 \
testlist=recovery-small

Change-Id: Idbfdb7b084c7edac8784008e0455f76632aa685b
Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49418
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-16676 build: always include llcrypt sources 80/50580/2
Sebastien Buisson [Fri, 7 Apr 2023 21:51:03 +0000 (14:51 -0700)]
LU-16676 build: always include llcrypt sources

llcrypt sources should always be included in source packages.
Binary build will decide whether to include llcrypt in the build
objects or not.

Lustre-change: https://review.whamcloud.com/50446
Lustre-commit: 869dae0f37ce0f6999b1a6348c8e594b53ba56d9

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I726d7deb27687bffebce55f6c09d578e6290aac7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-14958 kernel: use rhashtable for revoke records in jbd2 30/50730/2
Alex Zhuravlev [Mon, 24 Apr 2023 19:56:51 +0000 (12:56 -0700)]
LU-14958 kernel: use rhashtable for revoke records in jbd2

A resizable hashtable should improve journal replay time when
the journal has millions of revoke records. Note that
rhashtable is used during replay only, as removal with list_del()
is less expensive and is used a lot during regular processing.

before:
1048576 records - 95 seconds
2097152 records - 580 seconds

after:
1048576 records - 2 seconds
2097152 records - 3 seconds
4194304 records - 7 seconds
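
One way to read these numbers is through hash chain length. The sketch below
is back-of-the-envelope arithmetic, not a measurement: the fixed bucket count
of 256 and the doubling policy are assumptions made for illustration, and the
function names are hypothetical.

```python
def avg_chain_length(records: int, buckets: int) -> float:
    """Average entries per bucket, i.e. expected work per lookup."""
    return records / buckets

def resized_buckets(records: int, max_load: float = 1.0, start: int = 256) -> int:
    """Double the bucket count until the load factor is at most max_load."""
    buckets = start
    while records / buckets > max_load:
        buckets *= 2
    return buckets

if __name__ == "__main__":
    n = 1048576
    print(avg_chain_length(n, 256))                 # fixed-size table
    print(avg_chain_length(n, resized_buckets(n)))  # resizable table
```

With a fixed table the chains grow with the record count, so total replay work
grows faster than linearly. A table that resizes keeps chains short, which is
consistent with the roughly linear "after" timings.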

Lustre-change: https://review.whamcloud.com/45122
Lustre-commit: c3bb2b778d6b40a5cecb01993b55fcc107305b4a

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9a9e3801223fa9e36cbf6d2ef5ddbad5dff3e19d
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50730
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16755 kernel: RHEL 8.8 client and server support 29/50729/4
Jian Yu [Mon, 24 Apr 2023 20:02:42 +0000 (13:02 -0700)]
LU-16755 kernel: RHEL 8.8 client and server support

This patch makes changes to support RHEL 8.8 release
with kernel 4.18.0-477.el8 for Lustre client and server.

Lustre-change: https://review.whamcloud.com/50708
Lustre-commit: TBD (from 401fbba442e365a7c79f5526f44ebac7e6a2fc07)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: Ie47f131e0340a601c8a5d748ecf9b1b73d4baa1f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50729
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16510 build: include unsafe_memcpy definition 67/50667/2
Patrick Farrell [Tue, 18 Apr 2023 06:12:53 +0000 (23:12 -0700)]
LU-16510 build: include unsafe_memcpy definition

The original LU-16510 missed a key part of the
unsafe_memcpy code from the upstream kernel, and so we
weren't actually defining unsafe_memcpy() as intended.

Thanks to Aurelien Degremont <adegremont@nvidia.com> for
pointing this out.

Lustre-change: https://review.whamcloud.com/50573
Lustre-commit: 565b21bf65e385a9b4fd8ee31cabe7892345b783

Fixes: 919b93b9 ("LU-16510 build: fortified memcpy from linux 6.1")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib9e2d56ed0b3691f1ab9fcd25403fa86ac784b6d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50667
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16509 lnet: quash memcpy WARN_ONCE false positive 47/50647/2
Shaun Tancheff [Fri, 27 Jan 2023 07:08:23 +0000 (01:08 -0600)]
LU-16509 lnet: quash memcpy WARN_ONCE false positive

Linux v6.1-rc1-4-g6f7630b1b5bc
  fortify: Capture __bos() results in const temp var

In lnet_peer_push_event() the memcpy triggers a WARN_ONCE
due to the flexible array at the end of
struct lnet_ping_info contained in struct lnet_ping_buffer

Use unsafe_memcpy() to avoid this false positive warning.

Lustre-change: https://review.whamcloud.com/49801
Lustre-commit: a3cf8587b6fafdfed16fe0870efddcf6c0746c88

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4aa8f38678cd1522004d98b58a3f440d8a38589c
Signed-off-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50647
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-15404 ldiskfs: use per-filesystem workqueues to avoid deadlocks 86/50586/2
Andrew Perepechko [Tue, 21 Mar 2023 12:30:58 +0000 (08:30 -0400)]
LU-15404 ldiskfs: use per-filesystem workqueues to avoid deadlocks

Calling flush_scheduled_work() under s_umount is dangerous and may
cause deadlocks. This patch backports the fix from
https://lore.kernel.org/all/20220402084023.1841375-1-anserper@ya.ru/

Lustre-change: https://review.whamcloud.com/50354
Lustre-commit: 616fa9b581798e1b66e4d36113c29531ad7e41a0

Fixes: e239a14001 ("LU-15404 ldiskfs: truncate during setxattr leads to kernel panic")
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: Ia191b70166f94f34e96a282ec18bd8650871e108
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16413 osd-ldiskfs: fix T10PI for CentOS 8.x 14/50514/3
Li Dongyang [Mon, 19 Dec 2022 10:03:47 +0000 (21:03 +1100)]
LU-16413 osd-ldiskfs: fix T10PI for CentOS 8.x

Recreate the currently broken lustre kernel patches
to allow using custom integrity functions for bio.
Note we don't need to save the generate_fn anymore,
it will be used once we call bio_integrity_prep_fn().

Add upstream fix
b13e0c718568 ("block: bio-integrity: Advance seed correctly
for larger interval sizes") for CentOS 8.0 to 8.6.

Handle the kernel api changes for the T10PI generate and
verify functions introduced in CentOS 8.x kernel,
mostly because of switching to blk_integrity_iter.

Update the custom generate and verify functions, to sync
with upstream versions.
- Add T10-DIF-TYPE2, currently only a place holder,
  not used in upstream either.
- Use __be16 instead of __u16 for guard tags.

Only reuse guard tags if the rpc checksum is the same
one supported on the target. We already have some protection
during checksum type negotiation, the server
will mark the target's T10PI type as the only
T10PI checksum type supported. But it's still good to
have the logic in place.

Do not call bio_integrity_prep() if the custom interface
bio_integrity_prep_fn() does not exist, submit_bio() will
do that for us.

On the servers, show the target's T10PI checksum as
the preferred checksum_type even if it's not the fastest.
Note this is only cosmetic and does not impact the checksum
type used, which is still done during negotiation.

Lustre-change: https://review.whamcloud.com/49441
Lustre-commit: 4f0273b3bc7d2159d255ea8ce8ec1804fa67bfd8

Change-Id: I2d0ba0b80ba9cde2977da24db08095671aa5373c
Test-Parameters: trivial
Fixes: 293844d132 ("LU-16222 kernel: RHEL 8.7 client and server support")
Fixes: f176efd183 ("LU-12269 kernel: RHEL 8.0 server support")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50514
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16589 tests: fix hard-link failure in sanityn/55d 48/50348/3
Jian Yu [Tue, 21 Mar 2023 05:11:21 +0000 (22:11 -0700)]
LU-16589 tests: fix hard-link failure in sanityn/55d

Since coreutils version 8.31, the stat() and lstat()
operations were removed from ln by commit 571f63f5010b,
which caused the following dir hard-link failure in
sanityn/55d:

ln: failed to create hard link '/mnt/lustre2/d55d.sanityn/d55d.sanityn/'
=> '/mnt/lustre2/d55d.sanityn/f1': No such file or directory

This actually reveals a kernel issue which is fixed by commit
v5.18-rc2-188-gb3d4650d82c7.

To avoid the kernel issue and keep the test effective,
this patch appends the target filename to the $tdir/
so as to fix the hard-link failure.

Lustre-change: https://review.whamcloud.com/50127
Lustre-commit: 25c6b7ad2859729197c3cc6e6dcf0621e4bda6fa

Test-Parameters: trivial env=ONLY=55d testlist=sanityn
Test-Parameters: trivial clientdistro=el9.0 env=ONLY=55d testlist=sanityn
Test-Parameters: trivial clientdistro=sles15sp4 env=ONLY=55d testlist=sanityn
Test-Parameters: trivial clientdistro=sles15sp3 env=ONLY=55d testlist=sanityn

Change-Id: I42313e43eaea3d94007d534bf38efdeacf2ede43
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50348
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 months agoLU-15935 tests: add version check to replay-dual test_33 00/49400/3
Jian Yu [Wed, 14 Dec 2022 02:02:47 +0000 (18:02 -0800)]
LU-15935 tests: add version check to replay-dual test_33

This patch adds MDS version check to replay-dual test_33
to avoid interop test failure.

Lustre-change: https://review.whamcloud.com/49398
Lustre-commit: 92c639769d195bf3ff8c3e77c093338ac6de5e2e

Test-Parameters: trivial \
serverjob=lustre-b2_15 serverbuildno=28 \
env=ONLY=33 testlist=replay-dual

Test-Parameters: trivial env=ONLY=33 testlist=replay-dual

Change-Id: I3ec665302a431d3c0f07bc819a08237dbc5b4309
Fixes: 1a79d395dd ("LU-15935 target: keep track of multirpc slots in last_rcvd")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49400
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 months agoLU-15935 target: keep track of multirpc slots in last_rcvd 99/49399/2
Etienne AUJAMES [Wed, 14 Dec 2022 01:46:19 +0000 (17:46 -0800)]
LU-15935 target: keep track of multirpc slots in last_rcvd

OBD_INCOMPAT_MULTI_RPCS is cleared by tgt_boot_epoch_update() if the
recovery is aborted. This supposes that all the clients are evicted
but that is not true. Some clients could have successfully finished
their recovery. In that case, those clients will keep their last_rcvd
slot.

This patch modifies lut_num_client to keep track of multirpc
slots in last_rcvd.
For now the counter is used only by tgt_fini() to clear
OBD_INCOMPAT_MULTI_RPCS. So we can expand this use case for
tgt_boot_epoch_update().

Add replay-dual test_33.

Lustre-change: https://review.whamcloud.com/48082
Lustre-commit: 1a79d395dd61ea2e21598bfaa5b39375e64ec22c

Test-Parameters: testlist=replay-dual env=ONLY=33,ONLY_REPEAT=30
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I70791c9dcb7cc77f018b9e5c95568598d54f0322
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-15000 llog: read canceled records in llog_backup 98/48898/3
Etienne AUJAMES [Mon, 17 Oct 2022 23:10:34 +0000 (16:10 -0700)]
LU-15000 llog: read canceled records in llog_backup

llog_backup() does not reproduce index "holes" in the generated copy.
This can leave the llog copy with indexes that differ from the source,
which may confuse the configuration update mechanism that relies on
matching indexes between the MGS source and the target copy.

These index gaps can be caused by "lctl --device MGS llog_cancel".

This patch adds a "raw" read mode to llog_process* to read canceled
records, so llog_backup is now able to reproduce an exact copy of
the original.
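
The index drift described above can be sketched in user space. This is
only a model, assuming a log is a list of (index, payload, canceled)
records; Record, backup_skipping_canceled() and backup_raw() are
hypothetical names, not Lustre functions.

```python
from typing import NamedTuple


class Record(NamedTuple):
    index: int
    payload: str
    canceled: bool


def backup_skipping_canceled(source):
    """Append only live records; the copy assigns its own consecutive indexes."""
    copy = []
    for rec in source:
        if rec.canceled:
            continue
        copy.append(Record(len(copy) + 1, rec.payload, False))
    return copy


def backup_raw(source):
    """Copy every record, canceled or not, so the indexes match the source."""
    return [Record(rec.index, rec.payload, rec.canceled) for rec in source]


source = [
    Record(1, "setup", False),
    Record(2, "add_conn", True),   # canceled record, leaves a hole at index 2
    Record(3, "param", False),
]

for rec in backup_skipping_canceled(source):
    print("skipping copy:", rec.index, rec.payload)
for rec in backup_raw(source):
    print("raw copy:     ", rec.index, rec.payload, rec.canceled)
```

In the skipping copy the "param" record moves from index 3 to index 2,
which is the kind of mismatch a raw read avoids.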

Lustre-change: https://review.whamcloud.com/46552
Lustre-commit: d8e2723b4e9409954846939026c599b0b1170e6e

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I811e23de8f4545bed36a44fedc2638d7418830dd
Reviewed-by: Dominique Martinet <qhufhnrynczannqp.f@noclue.notk.org>
Reviewed-by: DELBARY Gael <gael.delbary@cea.fr>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48898
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
12 months agoLU-15357 iokit: fix the obsolete usage of cfg_device 66/49566/2
Hongchao Zhang [Fri, 6 Jan 2023 07:13:46 +0000 (23:13 -0800)]
LU-15357 iokit: fix the obsolete usage of cfg_device

The LCTL command "cfg_device" is obsolete and some operations
(such as "cleanup", "detach") don't support it anymore.
In mds_survey and lfsck-performance it causes the echo client
device not to be destroyed and causes LBUG when umounting the
related Lustre device.

Lustre-change: https://review.whamcloud.com/45872
Lustre-commit: a20b78a81d091cebd6b9e6c87537b2c955084cd5

Change-Id: If7f6eff080906e395023289652fcd2a78dfb6fb7
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49566
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16670 enc: make sure DoM files are correctly decrypted 30/50430/5
Sebastien Buisson [Mon, 27 Mar 2023 08:46:07 +0000 (10:46 +0200)]
LU-16670 enc: make sure DoM files are correctly decrypted

Make sure DoM files are decrypted upon read by loading their
associated encryption context, via llcrypt_prepare_readdir()/
llcrypt_get_encryption_info().

Fix sanity-sec test_50 accordingly.

Lustre-change: https://review.whamcloud.com/50429
Lustre-commit: 1c424252d37c64e3c223c19dced3cad2649c1f61

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie9ef3cbb08d2295a2fd10b9e9ab0862119c7723e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50430
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16524 nodemap: filter out unknown records 30/50230/2
Sebastien Buisson [Wed, 8 Mar 2023 10:26:38 +0000 (11:26 +0100)]
LU-16524 nodemap: filter out unknown records

Ignore records of type NODEMAP_CLUSTER_IDX or NODEMAP_GLOBAL_IDX if
their subtype is not known. It would come from an upgraded server on
which new nodemap properties/entries would be set, and then downgraded
back to an older version.
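
A rough user-space model of this filtering, assuming records are
(type, subtype, payload) tuples. The numeric values and the
KNOWN_SUBTYPES table are placeholders for illustration, not the real
nodemap constants.

```python
# Placeholder values; the real constants live in the Lustre nodemap code.
NODEMAP_CLUSTER_IDX = 1
NODEMAP_GLOBAL_IDX = 2
NODEMAP_OTHER_IDX = 3          # hypothetical type that is never filtered

# Hypothetical subtypes understood by this (older) version.
KNOWN_SUBTYPES = {
    NODEMAP_CLUSTER_IDX: {0, 1},
    NODEMAP_GLOBAL_IDX: {0},
}


def filter_known(records):
    """Drop cluster/global records whose subtype this version does not know."""
    kept = []
    for rec_type, subtype, payload in records:
        known = KNOWN_SUBTYPES.get(rec_type)
        if known is not None and subtype not in known:
            continue               # written by a newer server, ignore it
        kept.append((rec_type, subtype, payload))
    return kept


records = [
    (NODEMAP_CLUSTER_IDX, 0, "known cluster property"),
    (NODEMAP_CLUSTER_IDX, 7, "property from a newer version"),
    (NODEMAP_GLOBAL_IDX, 9, "global entry from a newer version"),
    (NODEMAP_OTHER_IDX, 9, "other record types pass through"),
]
print(filter_known(records))
```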

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7e349fde7fc927b23500abb51d4aed91f938f8d1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50230
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months agoLU-16658 tests: disable performance-sanity test_6 25/50425/3
Andreas Dilger [Mon, 27 Mar 2023 07:45:14 +0000 (00:45 -0700)]
LU-16658 tests: disable performance-sanity test_6

This test is likely failing due to a bug in mdsrate, which is no
longer actively developed.  It should be replaced by mdtest.

Lustre-change: https://review.whamcloud.com/50386
Lustre-commit: 5e4897eb6f1c97d4f0120803780904db49c5abe7

Test-Parameters: trivial testlist=performance-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I05378fb75ed30e56983f4668c03725824ad5a8ab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-15925 lnet: add debug messages for IB 08/50408/2
Cyril Bordage [Thu, 9 Jun 2022 21:41:54 +0000 (23:41 +0200)]
LU-15925 lnet: add debug messages for IB

If net debug is enabled, collect information about the connection
when the tx status is ECONNABORTED (IB only).

Lustre-change: https://review.whamcloud.com/47583
Lustre-commit: 9153049bdc7ec8217691481df64551e2768455a9

Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I44a33703931630b85cc0e847e2a038217b7967c6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50408
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16579 llite: fix the wrong beyond read end calculation 78/50278/3
Qian Yingjin [Mon, 20 Feb 2023 03:11:54 +0000 (22:11 -0500)]
LU-16579 llite: fix the wrong beyond read end calculation

During testing, we found an endless loop in the read path, which
keeps returning AOP_TRUNCATED_PAGE (0x80001).
The reason is that the calculation of the ending beyond offset is
wrong: (iter->count + iocb->ki_pos).
The ending beyond offset is supposed to stay unchanged during
the read I/O loop over each page in buffered I/O mode.
However, @iter->count is decreased by the number of bytes read when
the read of each page finishes: @iter->count -= read_bytes.

In this patch, we store the ending beyond page index in
@lcc->lcc_end_index before calling @generic_file_read_iter, which
loops over each page to read, and this solves the bug.
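
The arithmetic can be sketched as follows. This is a simplified model,
not the llite code: it assumes the position stays fixed while the
remaining count shrinks page by page, as the description implies, and
it works in byte offsets where the patch stores a page index. Both
function names are hypothetical.

```python
PAGE_SIZE = 4096


def end_offsets_recomputed(pos, count, page_size=PAGE_SIZE):
    """Recompute pos + remaining count for every page (the buggy form)."""
    seen = []
    remaining = count
    while remaining > 0:
        seen.append(pos + remaining)       # shrinks as bytes are consumed
        remaining -= min(page_size, remaining)
    return seen


def end_offsets_stored(pos, count, page_size=PAGE_SIZE):
    """Compute the end once before the loop and reuse it (the fixed form)."""
    end = pos + count
    seen = []
    remaining = count
    while remaining > 0:
        seen.append(end)                   # unchanged for the whole read
        remaining -= min(page_size, remaining)
    return seen


print(end_offsets_recomputed(0, 3 * PAGE_SIZE))
print(end_offsets_stored(0, 3 * PAGE_SIZE))
```

With the recomputed form, pages that belong to the requested read can
end up looking as if they were past its end.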

Lustre-change: https://review.whamcloud.com/50065
Lustre-commit: ae356dc325877bd130ad94acc5f3610898de8a8a

Fixes: 2f8f38effa ("LU-16412 llite: check read page past requested")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I5bb7ab82e5e2de8b9bd911798fb8ae65fc7c91af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50278
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16412 llite: check read page past requested 77/50277/2
Qian Yingjin [Fri, 20 Jan 2023 17:30:27 +0000 (12:30 -0500)]
LU-16412 llite: check read page past requested

Due to a kernel bug introduced in 5.12 in commit:
cbd59c48ae2bcadc4a7599c29cf32fd3f9b78251
("mm/filemap: use head pages in generic_file_buffered_read")
if the page immediately after the current read is in cache,
the kernel will try to read it.

This attempts to read a page past the end of requested
read from userspace, and so has not been safely locked by
Lustre.

For a page after the end of the current read, check whether
it is under the protection of a DLM lock. If so, we take a
reference on the DLM lock until the page read has finished
and then release the reference.  If the page is not covered
by a DLM lock, then we are racing with the page being
removed from Lustre.  In that case, we return
AOP_TRUNCATED_PAGE, which makes the kernel release its
reference on the page and retry the page read.  This allows
the page to be removed from cache, so the kernel will not
find it and incorrectly attempt to read it again.
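
The decision described above, modeled in user space. DlmLock and
read_page_beyond_end() are hypothetical stand-ins; only the
AOP_TRUNCATED_PAGE value is taken from the Linux kernel headers.

```python
AOP_TRUNCATED_PAGE = 0x80001   # enum positive_aop_returns in the Linux kernel


class DlmLock:
    """Toy lock covering an inclusive range of page indexes."""

    def __init__(self, first, last):
        self.first = first
        self.last = last
        self.refs = 0

    def covers(self, index):
        return self.first <= index <= self.last


def read_page_beyond_end(index, locks, do_read):
    """Read a page past the requested range only under a DLM lock."""
    lock = next((l for l in locks if l.covers(index)), None)
    if lock is None:
        # Racing with removal of the page: ask the caller to drop the
        # page reference and retry.
        return AOP_TRUNCATED_PAGE
    lock.refs += 1                 # hold the lock for the duration of the read
    try:
        return do_read(index)
    finally:
        lock.refs -= 1             # release once the page read has finished


lock = DlmLock(0, 3)
print(read_page_beyond_end(2, [lock], lambda i: 0))
print(hex(read_page_beyond_end(9, [lock], lambda i: 0)))
```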

NB: Earlier versions of this description refer to stripe
boundaries, but the locking issue can occur whether or
not the page is on a stripe boundary, because dlmlocks
can cover part of a stripe.  (This is rare, but is
allowed.)

Lustre-change: https://review.whamcloud.com/49723
Lustre-commit: 2f8f38effac3a95199cdcdbd4854f958cdb0c72c

Change-Id: Ib93bd0624fda0ed1c2b89f609d15208c86e21c29
Signed-off-by: Qian Yingjin <qian@ddn.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50277
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16349 o2iblnd: Fix key mismatch issue 14/50214/2
Dean Luick [Thu, 19 Jan 2023 20:38:04 +0000 (21:38 +0100)]
LU-16349 o2iblnd: Fix key mismatch issue

If a pool memory region (mr) is mapped then unmapped without being
used, its key becomes out of sync with the RDMA subsystem.

At pool mr map time, the present code will create a local
invalidate work request (wr) using the mr's present key and then
change the mr's key.  When the mr is first used after being mapped,
the local invalidate wr will invalidate the original mr key, and
then a fast register wr is used with the modified key.  The fast
register will update the RDMA subsystem's key for the mr.

The error occurs when the mr is never used.  The next time the mr
is mapped, a local invalidate wr will again be created, but this
time it will use the mr's modified key.  The RDMA subsystem never
saw the original local invalidate, so now the RDMA subsystem's
key for the mr and o2iblnd's key for the mr are out of sync.

Fix the issue by tracking if the invalidate has been used.
Repurpose the boolean frd->frd_valid.  Presently, frd_valid is
always false.  Remove the code that used frd_valid to conditionally
split the invalidate from the fast register.  Instead, use frd_valid
to indicate when a new invalidate needs to be generated.  After a
post, evaluate if the invalidate was successfully used in the post.

These changes are only meaningful to the FRWR code path.  The failure
has only been observed when using Omni-Path Architecture.
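
The sequence above can be replayed with a small state model. It is a
sketch under assumptions, not the o2iblnd code: RdmaSubsystem and
PoolMr are hypothetical classes, keys are plain integers, and
need_invalidate stands in for the repurposed frd_valid without
implying its polarity in the real code.

```python
class RdmaSubsystem:
    """Stand-in for the RDMA layer's view of one memory region's key."""

    def __init__(self, key):
        self.key = key

    def post(self, invalidate_key, register_key):
        # The local invalidate must name the key the subsystem holds.
        if invalidate_key != self.key:
            raise ValueError("key mismatch")
        self.key = register_key    # fast register installs the new key


class PoolMr:
    def __init__(self, key, track_invalidate):
        self.key = key
        self.track_invalidate = track_invalidate
        self.need_invalidate = True
        self.invalidate_key = None

    def map(self):
        if self.track_invalidate and not self.need_invalidate:
            return                 # earlier invalidate is still unused
        self.invalidate_key = self.key   # invalidate built from present key
        self.key += 1                    # then the mr's key is changed
        self.need_invalidate = False

    def use(self, rdma):
        rdma.post(self.invalidate_key, self.key)
        self.need_invalidate = True      # invalidate was consumed by the post


def replay(track_invalidate):
    """Map, unmap without use, map again, then use the mr."""
    rdma = RdmaSubsystem(10)
    mr = PoolMr(10, track_invalidate)
    mr.map()                       # first mapping, never used
    mr.map()                       # second mapping
    try:
        mr.use(rdma)
    except ValueError:
        return "out of sync"
    return "in sync"


print("without tracking:", replay(False))
print("with tracking:   ", replay(True))
```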

Lustre-change: https://review.whamcloud.com/49714
Lustre-commit: 0c93919f1375ce16d42ea13755ca6ffcc66b9969

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I532a11f10ae6a5917a4c054f37747d08eb4d6331
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50214
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16557 client: -o network needs add_conn processing 87/50187/2
Mikhail Pershin [Mon, 13 Feb 2023 09:07:45 +0000 (12:07 +0300)]
LU-16557 client: -o network needs add_conn processing

Mount option -o network restricts the client import to use
only the selected network. It processes connection UUID/NIDs
during 'setup' config command handling but skips any
'add_conn' command whose UUID does not mention that
network. Meanwhile, a connection UUID is just a name and may
have many NIDs configured, including those on the restricted
network, which are skipped as well. Therefore the client import
configuration misses failover NIDs on the restricted network.

The patch makes the import save the restricted network information
after 'setup' command processing, so it is applied to any
client_import_add_conn() call. The 'add_conn' command is
always processed now and its NIDs are filtered in the
same way as for 'setup'.
Test 31 in sanity-sec.sh is extended to check that the import's
failover_nids has all and only the NIDs on the restricted network.
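
A user-space sketch of the difference, assuming NIDs are strings of the
form "address@network" and that a connection UUID is named after one of
its NIDs. add_conn_old() and add_conn_new() are hypothetical models of
the behavior before and after the patch, not client code.

```python
def network_of(nid):
    """Return the network part of an 'address@network' NID string."""
    return nid.rsplit("@", 1)[-1]


def nids_on_network(nids, network):
    return [nid for nid in nids if network_of(nid) == network]


def add_conn_old(import_nids, uuid, nids, network):
    """Skip the whole command unless the UUID itself names the network."""
    if network_of(uuid) != network:
        return
    import_nids.extend(nids_on_network(nids, network))


def add_conn_new(import_nids, uuid, nids, network):
    """Always process the command; filter its NIDs as 'setup' does."""
    import_nids.extend(nids_on_network(nids, network))


uuid = "192.168.1.2@tcp"                       # just a name
nids = ["192.168.1.2@tcp", "10.0.0.2@o2ib"]    # NIDs behind that name

old, new = [], []
add_conn_old(old, uuid, nids, "o2ib")
add_conn_new(new, uuid, nids, "o2ib")
print("old:", old)
print("new:", new)
```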

Lustre-change: https://review.whamcloud.com/49986
Lustre-commit: c508c9426838f16256223ab0bbd648bfbec25e46

Test-Parameters: env=ONLY=31 testlist=sanity-sec
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id70ebd836f061f154e3779b07b52f1baea9a1776
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16160 llite: SIGBUS is possible on a race with page reclaim 02/50202/2
Andrew Perepechko [Fri, 3 Mar 2023 22:11:50 +0000 (17:11 -0500)]
LU-16160 llite: SIGBUS is possible on a race with page reclaim

We can restart fault handling if page truncation happens
in parallel with the fault handler.

Lustre-commit: I6e60783e3334f87e799dc8b0e6e63d0bb358a236
Lustre-change: https://review.whamcloud.com/49647

Also included sanityn test from:
LU-16160 llite: clear stale page's uptodate bit
5b911e03261c3de6b0c2934c86dd191f01af4f2f
https://review.whamcloud.com/48607

Change-Id: I6e60783e3334f87e799dc8b0e6e63d0bb358a236
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50202
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-15860 socklnd: Duplicate ksock_conn_cb 11/48911/2
Chris Horn [Thu, 12 May 2022 18:16:10 +0000 (13:16 -0500)]
LU-15860 socklnd: Duplicate ksock_conn_cb

If two threads enter ksocknal_add_peer(), the first one to acquire
the ksnd_global_lock will create a ksock_peer_ni and associate a
ksock_conn_cb with it.

When the second thread acquires the ksnd_global_lock it will find the
existing ksock_peer_ni, but it does not check for an existing
ksock_conn_cb. As a result, it overwrites the existing ksock_conn_cb
(ksock_peer_ni::ksnp_conn_cb) and the ksock_conn_cb from the first
thread becomes stranded.

Modify ksocknal_add_peer() to check whether the peer_ni has an
existing ksock_conn_cb associated with it.
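
The check can be illustrated with a threaded user-space model. PeerNi
and PeerTable are hypothetical; the Python lock stands in for
ksnd_global_lock, and the real function may allocate and release the
extra ksock_conn_cb differently.

```python
import threading


class PeerNi:
    def __init__(self, nid):
        self.nid = nid
        self.conn_cb = None


class PeerTable:
    def __init__(self):
        self.lock = threading.Lock()   # stands in for ksnd_global_lock
        self.peers = {}

    def add_peer(self, nid):
        with self.lock:
            peer = self.peers.get(nid)
            if peer is None:
                peer = PeerNi(nid)
                self.peers[nid] = peer
            # The added check: keep an existing conn_cb instead of
            # overwriting it and stranding the first one.
            if peer.conn_cb is None:
                peer.conn_cb = object()
            return peer.conn_cb


table = PeerTable()
results = []
threads = [
    threading.Thread(target=lambda: results.append(table.add_peer("10.0.0.1@tcp")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len({id(cb) for cb in results}))
```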

Lustre-change: https://review.whamcloud.com/47361
Lustre-commit: 0c91d49a44e1214b5c65d4a557f6969b3d217881

Fixes: 7766f01e89 ("LU-13641 socklnd: replace route construct")
HPE-bug-id: LUS-10956
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6c0190a0c1d3321ddd85c763b86ad1f0d32cf2b9
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48911

13 months agoLU-16376 obdclass: NUL terminate long jobid strings 90/49490/2
Andreas Dilger [Thu, 8 Dec 2022 18:43:57 +0000 (11:43 -0700)]
LU-16376 obdclass: NUL terminate long jobid strings

It appears that some jobid names can be sent that are using the full
32-byte size, rather than containing an embedded NUL terminator. This
caused errors in lprocfs_job_stats_log() when it overflowed.

If lustre_msg_get_jobid() finds no NUL terminator within the buffer,
add one, so that the rest of the code doesn't have to deal with
unterminated strings.

This potentially exposes a larger issue that other places may not be
handling the unterminated string properly either, which needs to be
addressed separately on both the client and server.  Terminating the
jobid to 31 chars only on the client does not totally solve the issue,
since there will still be older clients that are not doing this, so
the server needs to handle this in any case.
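
A byte-level sketch of the idea, assuming the 32-byte field size named
above and that an unterminated jobid is cut to 31 characters.
terminated_jobid() is a hypothetical helper and does not reproduce
lustre_msg_get_jobid().

```python
JOBID_SIZE = 32   # field size named in the description


def terminated_jobid(buf: bytes) -> bytes:
    """Return the jobid up to its NUL, or at most JOBID_SIZE - 1 bytes."""
    field = buf[:JOBID_SIZE]
    nul = field.find(b"\0")
    if nul == -1:
        # No terminator inside the field: keep room for one.
        return field[:JOBID_SIZE - 1]
    return field[:nul]


print(terminated_jobid(b"job.1234\0garbage"))
print(len(terminated_jobid(b"x" * 32)))
```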

Lustre-change: https://review.whamcloud.com/49351
Lustre-commit: 9eba5d57297f807fddf046356c846478bbf232f4

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4c05fabdacb6a0bbf6477d3601a628fe1f3ebbe5
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16374 enc: align Base64 encoding with RFC 4648 base64url 45/49945/2
Sebastien Buisson [Sun, 18 Jul 2021 00:01:25 +0000 (19:01 -0500)]
LU-16374 enc: align Base64 encoding with RFC 4648 base64url

Lustre encryption uses a Base64 encoding to encode no-key filenames
(the filenames that are presented to userspace when a directory is
listed without its encryption key).
Make this Base64 encoding compliant with RFC 4648 base64url, and use
a leading '+' character to distinguish digested names.

This is adapted from kernel commit
ba47b515f594 fscrypt: align Base64 encoding with RFC 4648 base64url

To maintain compatibility with older clients, a new llite parameter
named 'filename_enc_use_old_base64' is introduced, set to 1 by
default for Lustre 2.15.
When set to 0, Lustre uses the new-style base64 encoding. When set to
1, Lustre uses the old-style base64 encoding.

To set this parameter globally for all clients, do on the MGS:
mgs# lctl set_param -P llite.*.filename_enc_use_old_base64={0,1}
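
For reference, the RFC 4648 base64url alphabet can be tried with
Python's standard library. This only illustrates the alphabet; it is
not the Lustre encoder, and stripping the '=' padding is an assumption
made for the example.

```python
import base64


def base64url_nopad(data: bytes) -> str:
    """Encode with the RFC 4648 base64url alphabet and drop '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


sample = b"\xfb\xff"
print(base64.b64encode(sample).decode("ascii"))   # standard alphabet
print(base64url_nopad(sample))                    # base64url alphabet
```

The base64url alphabet uses '-' and '_' in place of '+' and '/', so a
'+' can never appear inside an encoded name, which is what lets a
leading '+' mark a digested name unambiguously.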

Lustre-change: https://review.whamcloud.com/49581
Lustre-commit: 583ee6911b6cac7f2867a37101cc069b4011b73f

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaa2256da7fb591d842b5bb7aa474b2ee6de9899d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49945
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16444 enc: null-enc names cannot be digested form 51/49551/5
Sebastien Buisson [Wed, 4 Jan 2023 15:10:02 +0000 (16:10 +0100)]
LU-16444 enc: null-enc names cannot be digested form

When encrypted files have their names encrypted, long names are in
digested form in case access is done without the encryption key. The
digest is base64-encoded, and prepended with '_'.
With null encryption for file names, names are always plain text. In
this case, a legitimate '_' at the start of a name must not be
interpreted as a digested form.
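
The rule reduces to a small predicate. is_digested_name() and its
names_encrypted argument are hypothetical and only model the old-style
'_' prefix described above.

```python
def is_digested_name(name: str, names_encrypted: bool) -> bool:
    """A leading '_' marks a digest only when file names are encrypted."""
    return names_encrypted and name.startswith("_")


print(is_digested_name("_AbCdEf", names_encrypted=True))
print(is_digested_name("_notes.txt", names_encrypted=False))
```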

sanity-sec test_54 is improved to test the case of a file whose name
starts with '_'.

Lustre-change: https://review.whamcloud.com/49550
Lustre-commit: a0132a79df9b59d5d9b674665daf6cdbd79128a8

Fixes: f18c87cb53 ("LU-13717 sec: handle null algo for filename encryption")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idaad186afd06cfbabbe1d13e78f083d12876c8ff
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49551
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16452 tests: skip interop recovery-small/144a 81/49681/5
Andreas Dilger [Wed, 18 Jan 2023 19:31:29 +0000 (12:31 -0700)]
LU-16452 tests: skip interop recovery-small/144a

Skip recovery-small test_144a for MDS < 2.15.1
since the fix and its corresponding test were added there.

Fixes: aa6250b741 ("LU-15724 tests: MDT failover hang reproducer")
Test-Parameters: trivial testlist=recovery-small env=ONLY=144 serverversion=2.14.0
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I77bfdf55d0218aa9e252f742cc90f1c61216d506
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49681
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
13 months agoLU-16456 tests: skip conf-sanity test_129/132 in interop 02/49602/3
Andreas Dilger [Wed, 11 Jan 2023 19:02:04 +0000 (12:02 -0700)]
LU-16456 tests: skip conf-sanity test_129/132 in interop

test_129 was added in commit v2_14_56-40-gcefabee52
test_132 was added in commit v2_14_56-96-ge26d7cc39
They should be skipped for older MDS versions.

Lustre-change: https://review.whamcloud.com/49601
Lustre-commit: 7e566c6a1f9d5324718ebc7149153f3272363b9c

Test-Parameters: trivial testlist=conf-sanity env=ONLY=122-133 serverversion=2.14.0
Fixes: cefabee52 ("LU-15112 mgc: do not ignore target registration failure")
Fixes: e26d7cc399 ("LU-14399 hsm: process hsm_actions in coordinator")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1e276c816ecf2f30dc970f9b5afe85d722540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49602
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16125 tests: make sanity-sec more robust with SSK 93/49893/2
Sebastien Buisson [Tue, 30 Aug 2022 09:22:34 +0000 (11:22 +0200)]
LU-16125 tests: make sanity-sec more robust with SSK

Encryption related tests in sanity-sec carry out unmount and mount of
clients in order to exercise code with and without the encryption key.
In case SSK is in use, we need to make sure flavors are properly
applied before carrying on.

Lustre-change: https://review.whamcloud.com/48386
Lustre-commit: bee889e87584aa3bd2e6819db73d6adf129460ee

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I92e85dc6dcef43f70a7fe05db94cd18fe66a3a24
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49893
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-14692 tests: allow FID_SEQ_NORMAL for MDT0000 54/49754/2
Li Dongyang [Tue, 25 Jan 2022 00:53:33 +0000 (11:53 +1100)]
LU-14692 tests: allow FID_SEQ_NORMAL for MDT0000

Fix the tests assuming objects created for MDT0000
always have a seq number of 0, to prepare for
deprecating IDIF sequence.

Fix sanity test_312 on ZFS to properly identify which
OST the object was created on, and re-enable it.

Lustre-change: https://review.whamcloud.com/46293
Lustre-commit: eaae4655567b16260237764dadb7ab57df8b0edd

Test-Parameters: testlist=sanity env=ONLY="39r 312"
Test-Parameters: testlist=sanity-scrub env=ONLY=19
Test-Parameters: testlist=sanity-sec env=ONLY=37
Change-Id: I4bffabe25a6f84cdba760aabea1da3429715a283
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49754
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16101 tests: add sanity/27J to always_except 71/49971/5
Jian Yu [Sun, 12 Feb 2023 07:56:54 +0000 (23:56 -0800)]
LU-16101 tests: add sanity/27J to always_except

This patch adds sanity/27J to always_except for SLES15 SP4
and 5.16.0+ kernels before the issue introduced by upstream
commit 8c8387ee3f55
("mm: stop filemap_read() from grabbing a superfluous page")
is resolved.

Lustre-change: https://review.whamcloud.com/49970
Lustre-commit: 63dd644747f4eab20d640b4d87060e56c20bc37f

Test-Parameters: trivial clientdistro=sles15sp4 \
env=SANITY_EXCEPT=101j testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Iafde656530fcdc1de9265aacaa9266435c9d5c47
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49971
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Xing Huang <hxing@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-14598 tests: skip conf-sanity test_122b in interop 86/49586/4
Andreas Dilger [Mon, 9 Jan 2023 23:27:46 +0000 (16:27 -0700)]
LU-14598 tests: skip conf-sanity test_122b in interop

Code was fixed in 2.15.0.

Lustre-change: https://review.whamcloud.com/49583
Lustre-commit: a9f83af896f2ce20e9ee430ad371b707c0c140cc

Test-Parameters: trivial testlist=conf-sanity env=ONLY=122 serverversion=2.14.0
Fixes: 747fed818b ("LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6d9480f4b43706b597df6bd74c65959776cf2b5b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49586
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
13 months agoLU-16187 tests: Fix is_project_quota_supported() 49/49449/2
Arshad Hussain [Mon, 26 Sep 2022 09:31:41 +0000 (15:01 +0530)]
LU-16187 tests: Fix is_project_quota_supported()

is_project_quota_supported() is called from sanity-quota.sh
to verify that $ENABLE_PROJECT_QUOTAS is true for the ldiskfs
FS and that the current version of the lfs command supports
'project'.  To do this it calls 'lfs --help', which is
not supported. This patch replaces the 'lfs --help' call with
an 'lfs --list-commands' call to verify that the present
version of lfs supports 'project'.

Lustre-change: https://review.whamcloud.com/48654
Lustre-commit: d4848d779bb8716c6df2fe5438fbe00997f87f3d

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iba7e6696d3fa9e980088f448ae72b07a4b47f4f2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49449
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16655 scrub: upgrade scrub_file from 2.12 format 80/50480/2
Alexander Zarochentsev [Tue, 28 Mar 2023 16:00:09 +0000 (19:00 +0300)]
LU-16655 scrub: upgrade scrub_file from 2.12 format

Scrub_file->sf_oi_count has different offsets in Lustre-2.10,
Lustre-2.12, and Lustre-2.15 due to unintended format changes.
Lustre-2.15 reads sf_oi_count from the offset of sf_success_count,
so it may initialize an incorrect number of OI files and then be
unable to do FID lookups for existing filesystem objects.
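
The offset problem described above can be illustrated with a small
userspace sketch. The two structs below are hypothetical, simplified
layouts and NOT the real struct scrub_file; the demo_* names are
invented for illustration only. Narrowing one earlier field shifts
every later field, so code built against the newer layout reads
sf_oi_count from the place where the older format stored
sf_success_count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts: NOT the real struct scrub_file. */
struct demo_scrub_v1 {
	uint64_t sf_run_time;        /* 64-bit field in the older format */
	uint32_t sf_success_count;
	uint16_t sf_oi_count;
};

struct demo_scrub_v2 {
	uint32_t sf_run_time;        /* narrowed to 32 bits: later fields shift */
	uint32_t sf_success_count;
	uint16_t sf_oi_count;
};

/* In this sketch the new sf_oi_count lands on the old sf_success_count. */
static_assert(offsetof(struct demo_scrub_v2, sf_oi_count) ==
	      offsetof(struct demo_scrub_v1, sf_success_count),
	      "demo layouts are expected to collide");

/* Wrong: reinterpret an on-disk v1 record using the v2 layout. */
static uint16_t demo_oi_count_naive(const struct demo_scrub_v1 *disk)
{
	struct demo_scrub_v2 mem;

	memcpy(&mem, disk, sizeof(mem));
	return mem.sf_oi_count;
}

/* Right: convert field by field so every value keeps its meaning. */
static void demo_upgrade(const struct demo_scrub_v1 *disk,
			 struct demo_scrub_v2 *mem)
{
	mem->sf_run_time = (uint32_t)disk->sf_run_time;
	mem->sf_success_count = disk->sf_success_count;
	mem->sf_oi_count = disk->sf_oi_count;
}
```

An upgrade path along the lines of demo_upgrade() has to recognise the
older format first; how the actual patch detects it is not shown here.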

Fixes: a114f6b8c5 ("LU-13344 servers: change request timeouts to s32")
Fixes: 4c2f028a95 ("LU-9019 osd-ldiskfs: migrate to 64 bit time")
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Id7c8bd555229405d604456c48447f01fd121aca9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50480
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months agoLU-16221 kernel: new kernel [RHEL 9.1 5.14.0-162.18.1.el9_1] 24/49124/9
Jian Yu [Thu, 9 Mar 2023 06:10:43 +0000 (22:10 -0800)]
LU-16221 kernel: new kernel [RHEL 9.1 5.14.0-162.18.1.el9_1]

This patch makes changes to support new RHEL 9.1 release
for Lustre client.

Lustre-change: https://review.whamcloud.com/48938
Lustre-commit: a05d02ea0e43bc656b0c25b8cd821323857e6cc2

Test-Parameters: trivial clientdistro=el9.1 \
env=SANITY_EXCEPT=101j testlist=sanity

Test-Parameters: trivial serverdistro=el8.7 clientdistro=el9.1 \
env=SANITY_EXCEPT=101j testlist=sanity

Change-Id: I8af730f84c9ddf9dcb7e3ddfbd24a68173f51e8d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49124
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16510 build: fortified memcpy from linux 6.1 15/49815/5
Shaun Tancheff [Thu, 9 Feb 2023 03:50:05 +0000 (19:50 -0800)]
LU-16510 build: fortified memcpy from linux 6.1

The fortified memcpy() from Linux v5.11-11104-ga28a6e860c6c
through v5.18-rc5-1405-g43213daed6d6 reports a false-positive
out-of-bounds read.

In function 'memcpy' ...
  '__read_overflow2' declared with attribute error: detected
   read beyond size of object passed as 2nd parameter

Lustre-change: https://review.whamcloud.com/49811
Lustre-commit: 919b93b951d4a9aa0400b9c882a1f68b79d8f118

Test-Parameters: trivial
HPE-bug-id: LUS-11459
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3a59d8b647833c05ff4b51e327ed8bce894141fe
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49815
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-13485 libcfs: Parallel configure tests for libcfs 92/49092/7
Shaun Tancheff [Thu, 9 Mar 2023 05:41:15 +0000 (21:41 -0800)]
LU-13485 libcfs: Parallel configure tests for libcfs

Transform the compile tests in libcfs to run in parallel

Lustre-change: https://review.whamcloud.com/38349
Lustre-commit: 182fa9be075f5866aba2f37fbc3434cd0292ac0e

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I76ab65558dd456dc08d6ef4a1985455ce1f17913
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49092
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-12275 sec: remove bio functions in fscrypt compat 40/50140/3
Andreas Dilger [Wed, 8 Mar 2023 20:36:46 +0000 (12:36 -0800)]
LU-12275 sec: remove bio functions in fscrypt compat

Remove libcfs/libcfs/crypto/bio.c since direct block device access
is not needed for client builds, and the use of struct bio on the
client adds unnecessary complexity to portability.

Lustre-change: https://review.whamcloud.com/50023
Lustre-commit: d328818a456daf30c20c8df0aa0be9dd2a2b6a9e

Test-Parameters: trivial
Fixes: a813e8187 ("LU-12275 sec: add llcrypt as file encryption library")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I97642dfd85053b9ea4196374f2002ffb6a2540e5
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50140
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16210 llite: replace selinux_is_enabled() 30/49630/9
Etienne AUJAMES [Fri, 10 Feb 2023 01:22:56 +0000 (17:22 -0800)]
LU-16210 llite: replace selinux_is_enabled()

selinux_is_enabled() was removed in kernel 5.1.
Commit 39e5bfa added support for such kernels by assuming SELinux is
enabled if the function selinux_is_enabled() does not exist.

This has a performance impact: on older kernels (e.g. CentOS 7) getxattr
RPCs were not sent for "security.selinux" if SELinux was disabled.
Utilities like "ls -l" always try to get "security.selinux".
See LU-549 for more information.

This patch uses security_inode_listsecurity() when mounting the
client to find out whether an LSM module (SELinux) requires an xattr
to store file contexts. If an xattr name is returned, we store it and
use it for the in-request security context.

For getxattr/setxattr we use the stored LSM xattr name to filter
security context xattrs such as security.selinux. If the xattr does
not match the stored xattr name, we return -EOPNOTSUPP to userspace.
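
The filtering rule above can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: struct
demo_sb_info, demo_is_lsm_context_xattr() and demo_xattr_filter() are
hypothetical names, and treating "security.selinux" as the only LSM
context xattr is an assumption of the sketch, not something taken
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical per-mount state: the xattr name the active LSM reported
 * at mount time, or NULL when no LSM needs one (e.g. SELinux disabled).
 */
struct demo_sb_info {
	const char *lsm_xattr_name;
};

/*
 * Assumption for this sketch: only "security.selinux" is treated as an
 * LSM context xattr; the real patch may recognise more names.
 */
static int demo_is_lsm_context_xattr(const char *name)
{
	return strcmp(name, "security.selinux") == 0;
}

/*
 * Return 0 if the request should go on to the server, or -EOPNOTSUPP
 * if it can be refused locally without sending an RPC.
 */
static int demo_xattr_filter(const struct demo_sb_info *sbi, const char *name)
{
	assert(sbi != NULL && name != NULL);

	if (!demo_is_lsm_context_xattr(name))
		return 0;               /* not an LSM context xattr */
	if (sbi->lsm_xattr_name != NULL &&
	    strcmp(sbi->lsm_xattr_name, name) == 0)
		return 0;               /* matches the active LSM's xattr */
	return -EOPNOTSUPP;             /* no LSM asked for this xattr */
}
```

With SELinux disabled the stored name is NULL, so a lookup of
"security.selinux" from "ls -l" is answered locally, which is
consistent with the lgetxattr share dropping to 0% in the table below.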

It also adds an s_security check before security_inode_notifysecctx()
to avoid calling this function if SELinux is disabled (as in
nfs_setsecurity()).

For the "Enforcing SELinux Policy Check" functionality, the SELinux
check has been moved into l_getsepol: -ENODEV is returned if SELinux
is disabled.

Add a regression test "sanity test_434" for this use case.

*Note:*
This patch detects that SELinux is disabled without it being explicitly
disabled on the kernel cmdline. This is recommended for RHEL >= 8.5.

*Performance:*
Tested with "strace -c ls -l" on 100000 files in the root directory in
a multi-VM env (on Rocky 9). The FS is remounted for each test (cache
is cleaned) and SELinux is disabled.
 __________________ ___________ _________
| Total time %     | lgetxattr | statx   |
|__________________|___________|_________|
|Without the patch:|    29%    |   51%   |
|__________________|___________|_________|
|With the patch:   |    0%     |   87%   |
|__________________|___________|_________|
"ls -l" uses lgetxattr to get "security.selinux".

Linux-commit: 3d252529480c68bfd6a6774652df7c8968b28e41

Lustre-change: https://review.whamcloud.com/48875
Lustre-commit: 1d8faaf6caf4acaf0e2d4943b51c024a96c80624

Fixes: 39e5bfa ("LU-12355 llite: include file linux/selinux.h removed")
Fixes: 9bcac0b ("LU-549 llite: Improve statfs performance if selinux is disabled")
Test-Parameters: clientselinux=false clientdistro=el7.9 testlist=sanity env=ONLY=434,ONLY_REPEAT=20
Test-Parameters: clientselinux=false clientdistro=el8.5 testlist=sanity env=ONLY=434,ONLY_REPEAT=20
Test-Parameters: clientselinux=false clientdistro=el8.6 testlist=sanity env=ONLY=434,ONLY_REPEAT=20
Test-Parameters: clientselinux clientdistro=el8.6 testlist=sanity-selinux
Test-Parameters: clientselinux clientdistro=el8.6 testlist=sanity-selinux
Test-Parameters: clientselinux clientdistro=el7.9 testlist=sanity-selinux
Test-Parameters: clientselinux clientdistro=el7.9 testlist=sanity-selinux
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I4dac87ac0341b45a1c2fef836cdce0361017b3f5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49630
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-14645 tests: test lfs setdirstripe with '/$' 50/49650/4
Jian Yu [Wed, 8 Mar 2023 20:24:18 +0000 (12:24 -0800)]
LU-14645 tests: test lfs setdirstripe with '/$'

This patch improves one of the lfs setdirstripe tests to
verify that dir name ending with '/' also works.

Lustre-change: https://review.whamcloud.com/49463
Lustre-commit: 4b9a39d3ed58a664a2498911ca1d3c9073c13bd3

Test-Parameters: trivial mdscount=2 mdtcount=4 \
env=ONLY=24B testlist=sanity

Change-Id: I237d5a9ebad42cc0569aa1db487d0df147372316
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49650
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-15833 llapi: don't use realpath in llapi_search_fsname() 49/49649/4
Etienne AUJAMES [Fri, 10 Feb 2023 01:17:58 +0000 (17:17 -0800)]
LU-15833 llapi: don't use realpath in llapi_search_fsname()

This patch uses the st_dev value to determine the fsname in
llapi_search_fsname().
The main purpose of this is to limit the number of lstat() calls
(made by realpath()) in this function.

get_root_path() is modified to search for a mountpoint by device.
The last result of get_root_path() is cached to avoid reading
/proc/mounts on each call.

A new API function llapi_search_rootpath_by_dev() is added to get
the path of the Lustre mountpoint for the specified device value.

**Testing:**

*Environment:*
VMs: 1 client, 1 MDS (2MDT), 1 OSS (2 OST)
Lustre tree: test{001..100}/test{001..100}/test{01..10}/file{01..05}
(500000 files + 110100 folders)
OS: CentOS 7 (no statx)
Lustre: 2.15.50_15_g1116739

*Tests*
cd <rootfs>
strace lfs getstripe -r .
echo 3 > /proc/sys/vm/drop_caches
/usr/bin/time lfs getstripe -r . (2 iterations)

*Results*
times (s):

                 ______________________________
                | user | system | real | real% |
 _______________|______|________|______|_______|
|without patch: | 6.18 | 57.3   | 427  | 0%    |
|_______________|______|________|______|_______|
|with patch:    | 2.88 | 47.3   | 404  |-5.45% |
|_______________|______|________|______|_______|

strace (only significant changes are displayed):
(*stat = lstat + stat + fstat)
                 _____________________________________________
                | *stat  | mmap   | open   | read   | all     |
 _______________|________|________|________|________|_________|
|without patch: | 760545 | 110142 | 330379 | 330325 | 4742658 |
|_______________|________|________|________|________|_________|
|with patch:    | 440484 | 0      | 220277 | 19     | 3541739 |
|_______________|________|________|________|________|_________|

-25.32% syscalls after patching.

Lustre-change: https://review.whamcloud.com/47258
Lustre-commit: 4fd7d5585d33240a658f57bf7399da4415a7eb6c

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I3812d922d5b1d194d52132cba95d11820424c5d7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
13 months agoLU-11695 som: disabling xattr cache for LSOM on client 52/49952/3
Qian Yingjin [Wed, 8 Mar 2023 20:16:53 +0000 (12:16 -0800)]
LU-11695 som: disabling xattr cache for LSOM on client

To obtain up-to-date LSOM data, a client currently needs to set
llite.*.xattr_cache=0 to disable the xattr cache on the client
completely. As a result, other kinds of xattrs cannot be cached
on the client either.
This patch introduces a heavy-weight solution to disable caching
only for LSOM xattr data ("trusted.som") on the client.

Lustre-change: https://review.whamcloud.com/33711
Lustre-commit: 192902851d73ec246af92a2ff7be8f23b08c4343

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iab5ef3030b05ac09184d01f2a3a8ed92ff1cf26b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16292 llite: delete_from_page_cache not exported 21/49121/6
Shaun Tancheff [Wed, 8 Mar 2023 20:13:19 +0000 (12:13 -0800)]
LU-16292 llite: delete_from_page_cache not exported

Linux commit v5.16-rc4-44-g452e9e6992fe
filemap: Add filemap_remove_folio and __filemap_remove_folio

Directly removing a folio/page from the page cache is no longer
available.

Fall back to generic_error_remove_page for regular files,
and truncate_inode_pages_range as appropriate.

Lustre-change: https://review.whamcloud.com/49069
Lustre-commit: 738e69d4b97d28ef037fe50f4146aabead9a2528

Test-Parameters: trivial
HPE-bug-id: LUS-11198
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I634e7d7719d497ce035a78b424be8e9e8c5a8104
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
13 months agoLU-16118 build: Workaround __write_overflow_field errors 18/49118/6
Shaun Tancheff [Wed, 8 Mar 2023 20:08:27 +0000 (12:08 -0800)]
LU-16118 build: Workaround __write_overflow_field errors

Linux commit v5.17-rc3-1-gf68f2ff91512
   fortify: Detect struct member overflows in memcpy() at compile-time

memcpy() and memset() calls that span a collection of struct
members will trigger:

error: call to ‘__write_overflow_field’ declared with attribute
   warning: detected write beyond size of field (1st parameter);
   maybe use struct_group()?
   [-Werror] __write_overflow_field(p_size_field, size);
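
The struct_group() hint in the error message can be illustrated with a
userspace sketch. DEMO_STRUCT_GROUP below is a simplified, hypothetical
stand-in for the kernel macro, and struct demo_hdr is invented for the
example; the actual workaround used in this patch may differ. The idea
is to give the adjacent members a named wrapper so that the memcpy()
destination is the whole group rather than its first field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified stand-in for the kernel's struct_group(): the members are
 * reachable both directly and through the named sub-struct @NAME.
 */
#define DEMO_STRUCT_GROUP(NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } NAME; \
	}

/* Hypothetical header with two fields that are copied together. */
struct demo_hdr {
	uint32_t magic;
	DEMO_STRUCT_GROUP(body,
		uint32_t flags;
		uint32_t count;
	);
};

static_assert(sizeof(((struct demo_hdr *)0)->body) == 2 * sizeof(uint32_t),
	      "group covers exactly the two grouped members");

/* Copy flags and count in one call without touching magic. */
static void demo_copy_body(struct demo_hdr *dst, const struct demo_hdr *src)
{
	/*
	 * memcpy(&dst->flags, &src->flags, 8) would write past the end of
	 * the 4-byte 'flags' field, which is what the fortified check
	 * rejects. Using the group makes the intended size explicit.
	 */
	memcpy(&dst->body, &src->body, sizeof(dst->body));
}
```

Existing users of dst->flags and dst->count keep working unchanged,
because the anonymous struct still exposes the members by name.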

Lustre-change: https://review.whamcloud.com/48364
Lustre-commit: a3a51806ef361f55421a1bc07f64c78730ae50d5

Test-Parameters: trivial
HPE-bug-id: LUS-11194
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iacd1ab03d1b90ce62b5d7b65e1cd518a5f7981f2
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49118
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>