Whamcloud - gitweb
fs/lustre-release.git
2 years ago EX-6807 tests: sync MDT for sanity-lipe-find3/109
Alexandre Ioffe [Wed, 1 Feb 2023 07:00:50 +0000 (23:00 -0800)]
EX-6807 tests: sync MDT for sanity-lipe-find3/109

Sync all data to ensure that the newly created directory is
flushed to the MDT, so that lipe_scan3 works reliably
when it scans inodes.

Test-Parameters: trivial mdscount=2 mdtcount=4 ostcount=8 env=ONLY=109 testlist=sanity-lipe-find3
Test-Parameters: trivial testgroup=full-dne-part-2
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: If0e8b7de024a1c8c7e1124173e9b3472ea8616b0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49860
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-16159 osp: destroy should not overtake writes
Alex Zhuravlev [Thu, 26 Jan 2023 07:34:25 +0000 (10:34 +0300)]
LU-16159 osp: destroy should not overtake writes

Use transaction versioning for object destroy so that
destroy doesn't overtake writes and writes don't hit
non-existent objects.

Lustre-change: https://review.whamcloud.com/49787
Lustre-commit: e3367a423ae09fcf133ecb7d9b21abfe549e22c6

Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single env=ONLY="70b 71a 119",ONLY_REPEAT=10
Fixes: b054fcd785 ("LU-16159 lod: cancel update llogs upon recovery abort")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iec2a5c72f27825820d36ebbe20d55fa303358982
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49830
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-14909 test: mkdir_on_mdt0 to mkdir on MDT0
Lai Siyao [Tue, 3 Aug 2021 15:56:47 +0000 (11:56 -0400)]
LU-14909 test: mkdir_on_mdt0 to mkdir on MDT0

Many subtests in recovery-small.sh and replay-single.sh need to mkdir
on MDT0. Use mkdir_on_mdt0() to create such directories.

Lustre-change: https://review.whamcloud.com/44544
Lustre-commit: 96fd8f03c57d778fd40055b58f54b7310b704adc

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=recovery-small.sh
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id4a44da062350ea284f51c8c821302aebbfe9dee
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49946
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16101 tests: add sanity/27J to always_except
Jian Yu [Sun, 12 Feb 2023 08:02:08 +0000 (00:02 -0800)]
LU-16101 tests: add sanity/27J to always_except

This patch adds sanity/27J to always_except for SLES15 SP4
and 5.16.0+ kernels until the issue introduced by upstream
commit 8c8387ee3f55
("mm: stop filemap_read() from grabbing a superfluous page")
is resolved.

Lustre-change: https://review.whamcloud.com/49970
Lustre-commit: TBD (from a537ec478649579ab66866bae85083c68c82e96b)

Test-Parameters: trivial clientdistro=sles15sp4 testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Iafde656530fcdc1de9265aacaa9266435c9d5c47
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49972
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Xing Huang <hxing@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago RM-620 build: New tag 2.14.0-ddn76
Andreas Dilger [Wed, 8 Feb 2023 05:54:09 +0000 (22:54 -0700)]
RM-620 build: New tag 2.14.0-ddn76

New tag 2.14.0-ddn76

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iac23eaba4042970458eac24a01ce180d303fd5ff

2 years ago LU-16374 enc: align Base64 encoding with RFC 4648 base64url
Sebastien Buisson [Sun, 18 Jul 2021 00:01:25 +0000 (19:01 -0500)]
LU-16374 enc: align Base64 encoding with RFC 4648 base64url

Lustre encryption uses a Base64 encoding to encode no-key filenames
(the filenames that are presented to userspace when a directory is
listed without its encryption key).
Make this Base64 encoding compliant with RFC 4648 base64url, and use
a leading '+' character to distinguish digested names.

This is adapted from kernel commit
ba47b515f594 fscrypt: align Base64 encoding with RFC 4648 base64url

To maintain compatibility with older clients, a new llite parameter
named 'filename_enc_use_old_base64' is introduced, set to 0 by
default. When set to 0, Lustre uses the new base64url encoding. When
set to 1, Lustre uses the old-style base64 encoding.

To set this parameter globally for all clients, run on the MGS:
mgs# lctl set_param -P llite.*.filename_enc_use_old_base64={0,1}
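As a rough illustration of the alphabet change described above (this is not Lustre's implementation), the Python sketch below contrasts standard Base64 with the RFC 4648 base64url alphabet. The helper name b64url_nopad and the choice to drop '=' padding are assumptions made for this sketch.

```python
import base64


def b64url_nopad(data: bytes) -> str:
    """Encode bytes with the RFC 4648 base64url alphabet, without '=' padding.

    Hypothetical helper for illustration; dropping the padding is an assumption.
    """
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Three bytes chosen so that every 6-bit group maps to index 62 or 63,
# the only two positions where the two alphabets differ.
raw = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(raw).decode("ascii")  # alphabet includes '+' and '/'
urlsafe = b64url_nopad(raw)                       # alphabet uses '-' and '_' instead

print(standard, urlsafe)
```

Since '/' never appears in base64url output, the encoded string is safe as a single path component, and because '+' is not part of that alphabet, a leading '+' cannot be mistaken for encoded data.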

Lustre-change: https://review.whamcloud.com/49581
Lustre-commit: 583ee6911b6cac7f2867a37101cc069b4011b73f

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaa2256da7fb591d842b5bb7aa474b2ee6de9899d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49900
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16415 quota: enforce project quota for root
Sergey Cheremencev [Sat, 17 Dec 2022 21:42:10 +0000 (01:42 +0400)]
LU-16415 quota: enforce project quota for root

This patch adds an option to enforce project quotas for root.
It is disabled by default. To enable it, set
osd-ldiskfs.*.quota_slave.root_prj_enable to 1
on each target where this option is needed.
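The decision this option controls can be sketched in Python as below. This is a simplified model, not the OSD code: the function name quota_enforced and its arguments are hypothetical, and it assumes root is otherwise exempt from quota enforcement.

```python
def quota_enforced(uid: int, quota_type: str, root_prj_enable: bool = False) -> bool:
    """Return True if a quota of the given type applies to this user.

    Hypothetical model: non-root users are always subject to quota, while
    root is only subject to *project* quota, and only when the
    root_prj_enable tunable is set on the target.
    """
    if uid != 0:
        return True
    return quota_type == "project" and root_prj_enable


# Default behaviour: root bypasses project quota.
print(quota_enforced(0, "project"))
# With the option enabled, project quota applies to root too.
print(quota_enforced(0, "project", root_prj_enable=True))
```

In this model, user and group quotas for root are unaffected by the option; only the project quota check changes.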

The patch also adds sanity-quota_1j to test the new feature.

Lustre-change: https://review.whamcloud.com/49460
Lustre-commit: f147655c33ea61450105b602c82da900fd1417b5

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I978dc8442235149794f85110309f63bc560defdc
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-12275 tests: skip new nodemap params on old MGS
Andreas Dilger [Mon, 30 Jan 2023 21:56:41 +0000 (14:56 -0700)]
LU-12275 tests: skip new nodemap params on old MGS

Skip setting forbid_encryption and readonly_mount parameters on old
MGSes that do not support these options.  Otherwise test_61 failures
are seen during interop testing.  Running test_36 would also fail in
this case, except that it is already skipped due to encryption checks.

Lustre-change: https://review.whamcloud.com/49828
Lustre-commit: TBD (from 74c9b1f6e4d5b1fbbd615b87fb7c62c0fcb1a727)

Test-Parameters: trivial testlist=sanity-sec
Fixes: 598c48707c ("LU-12275 tests: exercise file content encryption/decryption")
Fixes: e7ce67de92 ("LU-15451 sec: read-only nodemap flag")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I94f2e2f609927fea618a3a22f103bd32ae3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49829
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-9859 libcfs: add "default" keyword for debug mask
Andreas Dilger [Fri, 21 Jan 2022 07:20:56 +0000 (00:20 -0700)]
LU-9859 libcfs: add "default" keyword for debug mask

Allow "lctl set_param debug=default" to reset the debug mask to
the default value.  This is useful if the debug needs to be set
to a higher value temporarily, but should be easily reset back
to the original value afterward.
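A toy Python model of the keyword handling is shown below. It is only a sketch: the DEFAULT_DEBUG contents and the set_debug() helper are hypothetical, and the real mask is a kernel bitmask rather than a set of names.

```python
# Hypothetical default mask for illustration; not the actual Lustre default.
DEFAULT_DEBUG = frozenset({"warning", "error", "emerg", "console"})


def set_debug(current: frozenset, value: str) -> frozenset:
    """Return the new debug mask after applying a 'debug=<value>' setting."""
    if value == "default":
        # The new keyword: discard whatever is set and restore the default.
        return DEFAULT_DEBUG
    if value.startswith(("+", "-")):
        flags = frozenset(value[1:].split())
        return current | flags if value[0] == "+" else current - flags
    return frozenset(value.split())


mask = set_debug(DEFAULT_DEBUG, "+rpctrace dlmtrace")  # raise verbosity temporarily
mask = set_debug(mask, "default")                      # reset without remembering the old value
print(sorted(mask))
```

The point of the keyword is that the caller does not need to save the previous mask before raising it.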

Lustre-change: https://review.whamcloud.com/46251
Lustre-commit: 4c9a5762413638cc630b1facfb565dcd765fce1e

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7d0d8fb81e51afb5ea6f29abea0d0814de3ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49875
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16191 socklnd: limit retries on conns_per_peer mismatch
Serguei Smirnov [Mon, 26 Sep 2022 23:47:24 +0000 (16:47 -0700)]
LU-16191 socklnd: limit retries on conns_per_peer mismatch

If connection initiator has a higher conns-per-peer setting than
its peer, don't try to create extra connections forever as the
peer will keep rejecting them. A few retries should suffice to
resolve a valid race.
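The idea of bounding the retries can be sketched in Python as follows. This is a hedged illustration, not the socklnd code: MAX_EXTRA_CONN_RETRIES, open_extra_connections() and the Peer class are hypothetical, and counting consecutive rejections is an assumption of this sketch.

```python
MAX_EXTRA_CONN_RETRIES = 3  # hypothetical limit, chosen for illustration


class Peer:
    """Toy peer that accepts at most `limit` connections (its conns_per_peer)."""

    def __init__(self, limit: int):
        self.limit = limit
        self.accepted = 0
        self.attempts = 0

    def accept(self) -> bool:
        self.attempts += 1
        if self.accepted < self.limit:
            self.accepted += 1
            return True
        return False


def open_extra_connections(wanted: int, try_connect,
                           max_retries: int = MAX_EXTRA_CONN_RETRIES) -> int:
    """Open up to `wanted` connections, giving up after `max_retries`
    consecutive rejections instead of retrying forever."""
    opened = 0
    rejections = 0
    while opened < wanted and rejections < max_retries:
        if try_connect():
            opened += 1
            rejections = 0
        else:
            rejections += 1
    return opened


peer = Peer(limit=2)                            # peer allows fewer connections
print(open_extra_connections(4, peer.accept))   # initiator wants 4
```

A small retry budget still lets a genuine race resolve, while a permanent mismatch stops generating connection attempts.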

Lustre-change: https://review.whamcloud.com/48664
Lustre-commit: da893c6c9707ca3b2e7532d05f754fccf1cffc74

Test-Parameters: trivial
Fixes: 71b2476e ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I7d04d4ac41e98a738b6c85c3d323608038f5c51e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49914
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16125 tests: make sanity-sec more robust with SSK
Sebastien Buisson [Tue, 30 Aug 2022 09:22:34 +0000 (11:22 +0200)]
LU-16125 tests: make sanity-sec more robust with SSK

Encryption related tests in sanity-sec carry out unmount and mount of
clients in order to exercise code with and without the encryption key.
In case SSK is in use, we need to make sure flavors are properly
applied before carrying on.

Lustre-change: https://review.whamcloud.com/48386
Lustre-commit: bee889e87584aa3bd2e6819db73d6adf129460ee

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I92e85dc6dcef43f70a7fe05db94cd18fe66a3a24
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49892
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago EX-6808 tests: allow run_lfsck() to skip error
Jian Yu [Fri, 3 Feb 2023 21:05:47 +0000 (13:05 -0800)]
EX-6808 tests: allow run_lfsck() to skip error

If e2fsck or lfsck are repairing any corruption it means
the filesystem was corrupted by something, and that
shouldn't happen during testing. In that case run_lfsck()
fails and reports an error.

However, after running sanityn test 70b, there will be
some known corruptions that need to be repaired. We have to
make run_lfsck() support skipping some allowed errors
so that the sanity-lipe-scan3 and sanity-lipe-find3
tests can proceed.

Test-Parameters: trivial testlist=sanityn,sanity-lipe-scan3
Test-Parameters: trivial testlist=sanityn,sanity-lipe-find3
Test-Parameters: trivial mdscount=2 mdtcount=4 \
testlist=sanityn,sanity-lipe-scan3
Test-Parameters: trivial mdscount=2 mdtcount=4 \
env=SANITY_LIPE_FIND3_EXCEPT=109 \
testlist=sanityn,sanity-lipe-find3

Fixes: f9ba28af02 ("EX-6692 tests: run LFSCK in sanity-lipe-scan3/find3")
Change-Id: I686820a70b1a393611d90e286febfd3512fd40f7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49903
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-16457 tests: wait for remote sleep in sanity-pcc/101a
Andreas Dilger [Tue, 10 Jan 2023 15:37:03 +0000 (08:37 -0700)]
LU-16457 tests: wait for remote sleep in sanity-pcc/101a

Wait longer for the remote sleep command to start on the agent node.

Lustre-change: https://review.whamcloud.com/49587
Lustre-commit: 4b47c233b308dcfefe77a6a493c01d3b4fc59bbe

Test-Parameters: trivial testlist=sanity-pcc env=ONLY=101a,ONLY_REPEAT=200
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5dcbd6a7127b3e17aa658c87f5c75874432dc353
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49919
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16500 utils: set default ost index for lfs migrate
Jian Yu [Wed, 1 Feb 2023 18:33:59 +0000 (10:33 -0800)]
LU-16500 utils: set default ost index for lfs migrate

Running "lfs migrate <file>" without any SETSTRIPE arguments
to balance space usage keeps the PFL file layout, but also preserves
the OST selection exactly, which makes the migration virtually
useless for space balancing.

This patch fixes the above issue by clearing the specific
OST indices from the source layout before using the layout to
create the volatile file in lfs_migrate().
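The fix described above can be pictured with the following Python sketch. It is a simplified model under stated assumptions: the Component dataclass and clear_ost_indices() are hypothetical stand-ins for the layout handling in lfs_migrate(), and an empty ost_indices tuple is used here to mean "let the allocator choose".

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Component:
    """Hypothetical model of one PFL layout component."""
    extent_start: int
    extent_end: int
    stripe_count: int
    ost_indices: Tuple[int, ...] = ()  # empty means no specific OSTs requested


def clear_ost_indices(layout: List[Component]) -> List[Component]:
    """Copy the layout, keeping extents and stripe counts but dropping the
    specific OST indices so that new OSTs can be selected."""
    return [replace(comp, ost_indices=()) for comp in layout]


source = [
    Component(0, 1 << 20, 1, (3,)),
    Component(1 << 20, 1 << 30, 4, (3, 5, 7, 9)),
]
target = clear_ost_indices(source)
print(target)
```

The component structure survives the copy, so the migrated file keeps its PFL shape while its objects are free to land on less full OSTs.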

Lustre-change: https://review.whamcloud.com/49819
Lustre-commit: TBD (from 58ce6f2c8276df8b1a3b38db016fc301334d589c)

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I82e1dc0a11fdda7d555df994cf4e5f6e3dbdcb5c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-16501 tgt: skip free inodes in OST weights
Andreas Dilger [Fri, 3 Feb 2023 10:14:39 +0000 (03:14 -0700)]
LU-16501 tgt: skip free inodes in OST weights

In lu_tgt_qos_weight_calc() calculate the target weight consistently
with how the per-OST and per-OSS penalty calculation is done in
ltd_qos_penalties_calc().  Otherwise, the QOS weighting calculations
combine two different units, which incorrectly favors allocations on
OSTs with more free inodes over those with more free space.

Lustre-change: https://review.whamcloud.com/49890
Lustre-commit: TBD (from ab24f031908d100146b2f2900ab88e99e689d236)

Fixes: d3090bb2b486 ("LU-11213 lod: share object alloc QoS code with LMV")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1ccc52d7ad5dc440ae48403ba129efd6a0a51c33
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49904
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago RM-620 build: New tag 2.14.0-ddn75
Andreas Dilger [Wed, 1 Feb 2023 18:43:09 +0000 (11:43 -0700)]
RM-620 build: New tag 2.14.0-ddn75

New tag 2.14.0-ddn75

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I70de832ee204c96afd2359b03530303b3282d942

2 years ago LU-16412 llite: check read page past requested
Qian Yingjin [Fri, 20 Jan 2023 17:30:27 +0000 (12:30 -0500)]
LU-16412 llite: check read page past requested

Due to a kernel bug introduced in 5.12 in commit:
cbd59c48ae2bcadc4a7599c29cf32fd3f9b78251
("mm/filemap: use head pages in generic_file_buffered_read")
if the page immediately after the current read is in cache,
the kernel will try to read it.

This attempts to read a page past the end of the read
requested by userspace, so the page has not been safely
locked by Lustre.

For a page after the end of the current read, check whether
it is under the protection of a DLM lock. If so, we take a
reference on the DLM lock until the page read has finished
and then release the reference.  If the page is not covered
by a DLM lock, then we are racing with the page being
removed from Lustre.  In that case, we return
AOP_TRUNCATED_PAGE, which makes the kernel release its
reference on the page and retry the page read.  This allows
the page to be removed from cache, so the kernel will not
find it and incorrectly attempt to read it again.
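The decision described in the paragraph above can be sketched in Python as below. This is a toy model, not the llite code: DlmLock, read_page_past_request() and the do_read callback are hypothetical, and AOP_TRUNCATED_PAGE is represented by a stand-in string rather than the kernel's constant.

```python
AOP_TRUNCATED_PAGE = "AOP_TRUNCATED_PAGE"  # stand-in for the kernel return code


class DlmLock:
    """Toy DLM lock covering an inclusive range of page indices."""

    def __init__(self, start: int, end: int):
        self.start, self.end = start, end
        self.refs = 0

    def covers(self, index: int) -> bool:
        return self.start <= index <= self.end


def read_page_past_request(page_index, locks, do_read):
    """Handle a read of a page beyond the range requested by userspace."""
    lock = next((l for l in locks if l.covers(page_index)), None)
    if lock is None:
        # Not protected: we are racing with the page being removed, so ask
        # the caller to drop its page reference and retry.
        return AOP_TRUNCATED_PAGE
    lock.refs += 1          # hold the lock for the duration of the read
    try:
        return do_read(page_index)
    finally:
        lock.refs -= 1      # release once the page read has finished


lock = DlmLock(0, 9)
print(read_page_past_request(5, [lock], lambda idx: 0))
print(read_page_past_request(10, [lock], lambda idx: 0))
```

The try/finally pairing mirrors the requirement that the reference is dropped only after the page read has completed.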

NB: Earlier versions of this description refer to stripe
boundaries, but the locking issue can occur whether or
not the page is on a stripe boundary, because DLM locks
can cover part of a stripe.  (This is rare, but is
allowed.)

Lustre-change: https://review.whamcloud.com/49723
Lustre-commit: TBD (from ebfe62b8e8f4db0555a7a39ce5c764059422a260)

Change-Id: Ib93bd0624fda0ed1c2b89f609d15208c86e21c29
Signed-off-by: Qian Yingjin <qian@ddn.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49658
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16160 llite: SIGBUS is possible on a race with page reclaim
Andrew Perepechko [Sun, 15 Jan 2023 16:55:58 +0000 (11:55 -0500)]
LU-16160 llite: SIGBUS is possible on a race with page reclaim

We can restart fault handling if page truncation happens
in parallel with the fault handler.

Lustre-change: https://review.whamcloud.com/49647
Lustre-commit: b4da788a819f82d35b685d6ee7f02809c05ca005

Include updates to rw_seq_cst_vs_drop_caches.c from 5b911e0326
to add the '-m' option to test mmap IO operations in sanityn/16g.

Change-Id: I6e60783e3334f87e799dc8b0e6e63d0bb358a236
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49831
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-14896 utils: migrate file with only '--pool' option
Etienne AUJAMES [Mon, 2 Aug 2021 10:26:58 +0000 (12:26 +0200)]
LU-14896 utils: migrate file with only '--pool' option

"lfs migrate -p pool_name test_file" initiates a migration but does
not change the pool of the layout (it migrates from a copy of the layout).

This patch implements the same behavior as:
"lfs setstripe -p pool_name test_file"
It sets the pool name and uses the default parameters for the plain
layout.

Add sanity test 56xg to check file migrations with pool.

Lustre-change: https://review.whamcloud.com/44465
Lustre-commit: 0c60662b3a389790b19736a60063a1208e06bf70

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I1645eaca028974337218411d6a033f3acf9b9d6a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49774
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
2 years ago LU-16345 ofd: ofd_commitrw_read() with non-existing object
Alex Zhuravlev [Mon, 28 Nov 2022 09:17:25 +0000 (12:17 +0300)]
LU-16345 ofd: ofd_commitrw_read() with non-existing object

A client can get evicted during OST_READ's bulk transfer, so its LDLM
lock is cancelled and OST_DESTROY can remove the object.
ofd_commitrw_read() still needs to release the buffers and
ignore the fact that the object doesn't exist.

Lustre-change: https://review.whamcloud.com/49255
Lustre-commit: 5efc4c1cb4f2d0680992188d587f583e7a567a09

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ibe9413de41c23b1b4f6d52e9b17a06590b3c0726
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16285 ldlm: send the cancel RPC asap
Yang Sheng [Sat, 14 Jan 2023 17:56:14 +0000 (01:56 +0800)]
LU-16285 ldlm: send the cancel RPC asap

This patch tries to send the cancel RPC ASAP when a bl_ast is
received from the server. The existing problem is that the
lock could have been added to the regular queue before the bl_ast
arrived, for some other reason. This prevents the lock from
being canceled in a timely manner. The other problem is
that we collect many locks in one RPC to save
network traffic, but this process can take
a long time when dirty pages are being flushed.

 - Lock canceling is processed even if the lock had already
   been added to the bl queue when the bl_ast arrived, unless
   the cancel RPC has already been sent.
 - Send the cancel RPC immediately for a bl_ast lock. Don't
   try to add more locks in such a case.

Lustre-change: https://review.whamcloud.com/49527
Lustre-commit: b65374d96b2027213f253e128d3e5b3943ff2240

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ie5efff3f1ed4e46448371185a0c08968233e7644
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49651
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16445 sec: make nodemap root squash independent of map_mode
Sebastien Buisson [Thu, 5 Jan 2023 14:06:39 +0000 (15:06 +0100)]
LU-16445 sec: make nodemap root squash independent of map_mode

When the admin property is set to 0 on a nodemap, the root user must
be squashed, even if the map_mode property specifies to not map uids
or gids.
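The rule stated above can be illustrated with a small Python sketch. It is a deliberately simplified model, not the nodemap code: map_client_uid() and its parameters are hypothetical, map_mode is reduced to a single map_uids boolean, and properties such as trusted are left out.

```python
def map_client_uid(uid: int, *, admin: bool, map_uids: bool,
                   squash_uid: int, idmap: dict) -> int:
    """Return the filesystem UID to use for a client UID (simplified model)."""
    if uid == 0:
        # Root squash depends only on the admin property,
        # not on whether map_mode asks for UIDs to be mapped.
        return 0 if admin else squash_uid
    if not map_uids:
        return uid                       # UIDs are not mapped in this mode
    return idmap.get(uid, squash_uid)    # unmapped IDs are squashed (simplification)


# admin=0 and a map_mode that does not map UIDs: root is still squashed.
print(map_client_uid(0, admin=False, map_uids=False, squash_uid=99, idmap={}))
```

The ordering of the checks is the point of the sketch: the root test comes before any consideration of the mapping mode.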

Enhance sanity-sec test_17 to exercise this use case.

Lustre-change: https://review.whamcloud.com/49561
Lustre-commit: 1335eb1d599ceb6423de6800e0995614cdb37bd8

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1b41caa1ccc6e544ce9fac45b47d0c4c129221f7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49797
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago EX-6791 lipe: lamigo, lpurge are not for customer use
Alexandre Ioffe [Thu, 26 Jan 2023 22:11:04 +0000 (14:11 -0800)]
EX-6791 lipe: lamigo, lpurge are not for customer use

The lamigo and lpurge help messages now note that they are not
intended for direct customer use.

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial
Change-Id: I36ba2da080156da2d62ffa215cd7eb98b5c10adc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49794
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago RM-620 build: New tag 2.14.0-ddn74
Andreas Dilger [Wed, 25 Jan 2023 03:23:00 +0000 (20:23 -0700)]
RM-620 build: New tag 2.14.0-ddn74

New tag 2.14.0-ddn74

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0f3d9cfe21d8b9a3829edcc357d59a24b1e3084e

2 years ago LU-16228 tests: skip sanity/205e in interop tests
Andreas Dilger [Wed, 25 Jan 2023 00:51:18 +0000 (17:51 -0700)]
LU-16228 tests: skip sanity/205e in interop tests

Add a version check to sanity.sh test_205e and update the check
in test_205d to match the actual patch version that lljobstat
was landed in, so that it is not run during interop testing.

Test-Parameters: trivial testlist=sanity env=ONLY=205 serverversion=EXA6.1.0
Fixes: e9f9822822 ("LU-16228 utils: add lljobstat util")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I34d517c4b33e88f316cedbd94c8f48ace63ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49756
Tested-by: jenkins <devops@whamcloud.com>
2 years ago LU-16486 kernel: kernel update RHEL8.7 [4.18.0-425.10.1.el8_7]
Jian Yu [Thu, 19 Jan 2023 20:27:45 +0000 (12:27 -0800)]
LU-16486 kernel: kernel update RHEL8.7 [4.18.0-425.10.1.el8_7]

Update RHEL8.7 kernel to 4.18.0-425.10.1.el8_7.

Lustre-change: https://review.whamcloud.com/49683
Lustre-commit: TBD (from 390b84b102f63ab8daade91b4a34960d097028d1)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Change-Id: I5759d0cb06a1148689ed9b8c947cb6516ab3aca1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49708
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-16490 kernel: kernel update RHEL 7.9 [3.10.0-1160.81.1.el7]
Jian Yu [Thu, 19 Jan 2023 20:30:18 +0000 (12:30 -0800)]
LU-16490 kernel: kernel update RHEL 7.9 [3.10.0-1160.81.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.81.1.el7.

Lustre-change: https://review.whamcloud.com/49684
Lustre-commit: TBD (from 0a6b9460584046c0344204ad5169efac4d791e59)

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I46f556f327d92fde17790e223187df5b1c33d2c1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49709
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-16165 sec: retry mechanism for identity cache
Sebastien Buisson [Fri, 16 Sep 2022 16:02:51 +0000 (18:02 +0200)]
LU-16165 sec: retry mechanism for identity cache

Implement a retry mechanism in the identity cache in case the
identity upcall times out.
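A generic version of such a retry loop might look like the Python sketch below. It is an illustration only: lookup_identity(), the upcall callback, the dict-based cache and the max_retries default are all hypothetical and do not reflect the kernel implementation.

```python
def lookup_identity(uid, upcall, cache, max_retries=2):
    """Return the identity for `uid`, retrying the upcall if it times out.

    `upcall` is any callable that returns identity data or raises
    TimeoutError; a successful result is stored in `cache`.
    """
    if uid in cache:
        return cache[uid]
    last_error = None
    for _ in range(max_retries + 1):     # first attempt plus the retries
        try:
            cache[uid] = upcall(uid)
            return cache[uid]
        except TimeoutError as err:
            last_error = err             # remember the failure and try again
    raise last_error


attempts = {"n": 0}


def flaky_upcall(uid):
    """Toy upcall that times out twice before answering."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("identity upcall timed out")
    return {"uid": uid, "groups": [100]}


print(lookup_identity(500, flaky_upcall, {}))
```

Only timeouts are retried in this sketch; any other error from the upcall propagates immediately.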

Lustre-change: https://review.whamcloud.com/48579
Lustre-commit: 61c3b3a9bb848e256845462ffd79b15565cd23ad

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib70d3b851a6da3cf66dfed49b03be51da7886d01
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49747
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago EX-6692 tests: run LFSCK in sanity-lipe-scan3/find3
Jian Yu [Mon, 23 Jan 2023 19:43:02 +0000 (11:43 -0800)]
EX-6692 tests: run LFSCK in sanity-lipe-scan3/find3

Some tests running "lfs rm_entry" will leave the FID in
.lustre/fid/ undeleted, which causes sanity-lipe-scan3
and sanity-lipe-find3 to fail. Since it is not possible to
list the .lustre/fid/ directory, we have to run LFSCK to
link the FID back into .lustre/lost+found so as to
check if the file system needs to be reformatted.

Test-Parameters: trivial testlist=sanityn,sanity-lipe-scan3
Test-Parameters: trivial testlist=sanityn,sanity-lipe-find3

Fixes: 933691b3d7 ("EX-6692 tests: clean up test env in sanity-lipe-scan3.sh")
Fixes: de1bb57641 ("EX-6170 tests: make sanity-lipe-scan3.sh support remote MDS")
Fixes: 8b572c4de0 ("EX-6169 lipe: sanity-lipe-find3 reformat to clean lost+found")
Change-Id: If23479fc222052e25a7f21bcb70003c5176247b6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49674
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-16026 llite: always enable remote subdir mount
Lai Siyao [Sun, 28 Aug 2022 19:33:29 +0000 (15:33 -0400)]
LU-16026 llite: always enable remote subdir mount

For historical reasons, ROOT is revalidated with IT_LOOKUP in
.permission to ensure its permission is up to date, because ROOT is
never looked up. But the ROOT FID and layout are not changeable; it's
the PERM lock that should be revalidated, i.e., revalidate with
IT_GETATTR instead of IT_LOOKUP.

Since the PERM|UPDATE lock is on the MDT where the object is located,
the client can cache this lock, therefore a remote subdir mount doesn't
need to look up ROOT on each file access.

Deprecate mdt.*.enable_remote_subdir_mount.

Per http://review.whamcloud.com/19195, replace 'df' with 'lfs df' in
sanity 228b since the former doesn't support transparent recovery.

Add sanity 247h.

Lustre-change: https://review.whamcloud.com/48535
Lustre-commit: 6f490275b0e0455a431707775d685fb3df1d322d

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I66f8ee347f6c01a8a154245b10a1d93539ea13b8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49673
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-15082 osp: invalidate statfs data from the timer callback
Alex Zhuravlev [Tue, 12 Oct 2021 05:26:21 +0000 (08:26 +0300)]
LU-15082 osp: invalidate statfs data from the timer callback

osp_statfs_timer_cb() can be called just before the statfs data gets
stale. This in turn may cause an early wakeup of the precreate thread,
which would find the statfs data still up to date and go back to sleep.
If no precreate happens on this OSP (e.g. due to current space
usage), then the precreate thread will stay asleep for a long time,
the statfs data won't get refreshed and this may block new objects
on the corresponding OST.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86e16eed6f1068702db696a9ddec7a22994180b7
Reviewed-on: https://review.whamcloud.com/45199
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49694

2 years ago LU-16380 osd-ldiskfs: race in OI mapping
Lai Siyao [Sat, 17 Dec 2022 13:06:16 +0000 (08:06 -0500)]
LU-16380 osd-ldiskfs: race in OI mapping

There is a race between the OI scrub thread and OI mapping entry
insertion, which may add an inconsistent OI mapping entry without
starting the OI scrub thread. This may lead to osd_fid_lookup() always
returning -EINPROGRESS.

To avoid this race, osd_fid_lookup() returns -EINPROGRESS only when
OI mapping is inconsistent, and OI scrub thread is not running.

Lustre-change: https://review.whamcloud.com/49514
Lustre-commit: 43fe6e51804f8fb4cca4445be576233595e27b42

Fixes: 558784caad ("LU-15643 osd-ldiskfs: don't trigger scrub on irreparable FIDs")
Test-Parameters: mdscount=2 mdtcount=4 testlist=conf-sanity env=ONLY=108b,ONLY_REPEAT=50
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I05114b6a33940c210e9952f6e24f6c36fd7f76a2
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49719
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16335 test: add fail_abort_cleanup()
Lai Siyao [Wed, 7 Dec 2022 04:04:42 +0000 (23:04 -0500)]
LU-16335 test: add fail_abort_cleanup()

Add helper fail_abort_cleanup() to unlink test directories (call lfs
rm_entry if directory is broken) after fail_abort because after
LU-16159 update logs will be canceled upon recovery abort, which may
leave broken directories.

Update replay-single.sh in places where fail_abort is called and
directory may become broken.

Lustre-change: https://review.whamcloud.com/49335
Lustre-commit: d5fe41a02a6ed57bcbfc4a4c695bb509c9c7c313

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-single
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I260689b1a6fa5b0b4db5aab5095cb062ae57d612
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49713
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16335 mdt: skip target check for rm_entry
Lai Siyao [Wed, 7 Dec 2022 02:53:25 +0000 (21:53 -0500)]
LU-16335 mdt: skip target check for rm_entry

For "lfs rm_entry", the target may not exist, so the sanity check on it
may fail and cause rm_entry to fail.

Add sanity 832.

Lustre-change: https://review.whamcloud.com/49329
Lustre-commit: ae98c5fdaaf37daeb328b7110cbcf42754752c9d

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I824c7581af05c7494cf03c0c9bc999ca1abfec01
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49712
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h
Lai Siyao [Thu, 1 Dec 2022 08:17:00 +0000 (03:17 -0500)]
LU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h

The lustre_user.h header uses the non-standard strchrnul() function
in userspace.  This always causes the LC_IOC_REMOVE_ENTRY configure
check to fail, and in the end "lfs rm_entry" always returns -ENOTSUP.

Implement an alternative approach to avoid external dependencies on
the lustre_user.h header.  Also, LC_IOC_REMOVE_ENTRY is itself
unnecessary, the code can check for LL_IOC_REMOVE_ENTRY directly.

Replace the NFS-specific -ENOTSUP error return code with -EOPNOTSUPP.

Fix the compile test_400[ab] checks to not use "-std=c99" to verify
that the uapi headers are usable without this dependency.

Lustre-change: https://review.whamcloud.com/49328
Lustre-commit: efc5c8d4de60d394344506f7cfb188eaf04a4bac

Fixes: b59835f8b6 ("LU-13903 utils: have liblustreapi support Linux client")
Fixes: 7a7309fa84 ("LU-13274 uapi: make lustre UAPI headers C99 compliant")
Fixes: 6331eadbd6 ("LU-15420 uapi: avoid gcc-11 -Werror=stringop-overread")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If42743a2148c317b8a9b701ceb5d08bac5149f5f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49711
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16159 lod: cancel update llogs upon recovery abort
Lai Siyao [Sun, 28 Aug 2022 18:35:25 +0000 (14:35 -0400)]
LU-16159 lod: cancel update llogs upon recovery abort

If recovery is aborted, cancel the update catalogs from the catlist,
but keep them on disk for some time (for debugging purposes). This
avoids accumulating stale update records, and also avoids recovery
problems if update llogs are corrupt.

Update llogs are canceled after recovery completes and before regular
request processing. For these logs, the ctime will be set and the log
header will be marked with LLOG_F_MAX_AGE|LLOG_F_RM_ON_ERR; once
30 days have passed, they will be removed automatically.

Tidy up recovery abort code:
* if obd_abort_recovery is set, or OBD is stopping, stop both
  client recovery and MDT recovery.
* otherwise if obd_abort_mdt_recovery is set, stop MDT recovery only.
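
The two rules above can be read as a small decision function. The
following is only a sketch in Python, not the Lustre C code: the
function name and the string labels are hypothetical, and only the
flag names obd_abort_recovery and obd_abort_mdt_recovery come from
this message.

```python
def recoveries_to_stop(obd_abort_recovery: bool,
                       obd_abort_mdt_recovery: bool,
                       obd_stopping: bool) -> set:
    """Return which recoveries get stopped (hypothetical helper)."""
    # Full abort, or the OBD is stopping: stop client and MDT recovery.
    if obd_abort_recovery or obd_stopping:
        return {"client", "mdt"}
    # Otherwise an MDT-only abort stops MDT recovery alone.
    if obd_abort_mdt_recovery:
        return {"mdt"}
    # Neither flag set: nothing is stopped.
    return set()
```
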

lctl llog_print supports printing update log FIDs used by the specified
MDT:
* "lctl --device <MDT> llog_print update_log" will list all update
  llog FIDs used by this MDT device.

Disabled replay-single.sh 100c stripe check because abort_recovery
will cancel update llogs, and won't replay them upon next recovery.

Added replay-single.sh 100d.

Run formatall at the end of replay-single.sh because directory unlink
may fail.

Lustre-change: https://review.whamcloud.com/48584
Lustre-commit: b054fcd7852f6a22f8ec469ce94ddf6f3331ab34

Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie2bda6c097d65f5c51cba66c2dbf6ae4a5d36dda
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49403
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16228 utils: add lljobstat util
Lei Feng [Mon, 17 Oct 2022 05:36:14 +0000 (13:36 +0800)]
LU-16228 utils: add lljobstat util

The lljobstat utility reads data from job_stats file(s),
parses and aggregates the data, and lists the top jobs.

For example:
$ ./lljobstats -n 1 -c 3
---
timestamp: 1665984678
top_jobs:
- ll_sa_3508505.0: {ops: 64, ga: 64}
- touch.500:       {ops: 6, op: 1, cl: 1, mn: 1, ga: 1, sa: 2}
- bash.0:          {ops: 3, ga: 3}
...
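
The aggregation step can be sketched as follows. This is an
illustration in Python under stated assumptions, not the lljobstat
implementation: the function name and the input layout (a dict of
per-job operation counters) are hypothetical, and reading the real
job_stats files is left out. The "ops" value is taken to be the sum
of the per-operation counters, which matches the sample output above.

```python
def top_jobs(per_job_ops: dict, count: int) -> list:
    """Rank jobs by total operation count (hypothetical helper)."""
    ranked = sorted(
        per_job_ops.items(),
        key=lambda item: sum(item[1].values()),
        reverse=True,
    )
    # Emit one {job: {ops: total, <op>: n, ...}} entry per job.
    return [
        {job: {"ops": sum(ops.values()), **ops}}
        for job, ops in ranked[:count]
    ]


# Counters copied from the sample output above.
sample = {
    "bash.0": {"ga": 3},
    "touch.500": {"op": 1, "cl": 1, "mn": 1, "ga": 1, "sa": 2},
    "ll_sa_3508505.0": {"ga": 64},
}
result = top_jobs(sample, 3)
```
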

Includes part of "LU-16110 lprocfs: make job_stats and
rename_stats valid YAML" to make rename_stats valid
and verify the YAML output.

Includes "LU-16459 tests: fix YAML verification function"
to fix the test case of LU-16110.

Lustre-change: https://review.whamcloud.com/48888
Lustre-commit: TBD (from 08836199bbd26bdc1a800f5710691d9b6723b1eb)

Change-Id: I0c4ac619496c184a5aebbaf8674f5090ab722d72
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49560
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn73
Andreas Dilger [Tue, 17 Jan 2023 19:44:29 +0000 (12:44 -0700)]
RM-620 build: New tag 2.14.0-ddn73

New tag 2.14.0-ddn73

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I654b9727f7a52d24b81d07aa1b001567ad064681

2 years agoEX-6692 tests: clean up test env in sanity-lipe-scan3.sh
Jian Yu [Tue, 17 Jan 2023 07:00:05 +0000 (23:00 -0800)]
EX-6692 tests: clean up test env in sanity-lipe-scan3.sh

This patch reformats the file system in sanity-lipe-scan3.sh
to clean up the test env before running subtests.

Test-Parameters: trivial testlist=sanity-lipe-scan3
Test-Parameters: trivial mdscount=2 mdtcount=4 \
testlist=sanity-lipe-scan3

Fixes: de1bb57641 ("EX-6170 tests: make sanity-lipe-scan3.sh support remote MDS")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I301149f7f585716fcb39ba9065c2f372fb075344
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49621
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16385 obdclass: stop MGC before MGS
Alex Zhuravlev [Mon, 12 Dec 2022 13:35:41 +0000 (16:35 +0300)]
LU-16385 obdclass: stop MGC before MGS

Drop a reference to MGC when MGS is being unmounted so that
MGC doesn't try to disconnect from a missing MGS, which
can take a long time and hurt HA.

Lustre-change: https://review.whamcloud.com/49378
Lustre-commit: 817184a9788ae399dcd5cf53ae7c9801e4778a43

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib15f1ca56c47201bf6e29c12b3f81a11e55944ca
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14377 tests: make parallel-scale/rr_alloc less strict
Andreas Dilger [Wed, 19 Oct 2022 00:37:58 +0000 (18:37 -0600)]
LU-14377 tests: make parallel-scale/rr_alloc less strict

test_rr_alloc() sometimes fails with a difference of 3-4 objects
per OST, after creating 1500+ objects on each OST.  This should
not be considered fatal.  Make the test more lenient, and allow
a difference of up to 0.3% of objects between the OSTs.
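
As a rough illustration of the new tolerance (a Python sketch, not
the shell code of test_rr_alloc(); the function name is hypothetical
and measuring the 0.3% against the largest per-OST count is an
assumption), 0.3% of about 1500 objects is about 4.5, so a difference
of 3-4 objects now passes:

```python
def rr_alloc_balanced(objects_per_ost: list,
                      tolerance: float = 0.003) -> bool:
    """Check that per-OST object counts differ by at most 0.3%."""
    low, high = min(objects_per_ost), max(objects_per_ost)
    # Assumption: the allowed spread is a fraction of the largest count.
    return (high - low) <= tolerance * high
```
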

Fix some code style issues in the test.

Lustre-change: https://review.whamcloud.com/48914
Lustre-commit: b104c0a27713899a4d047f56fed57c30c39b8195

Test-Parameters: trivial testlist=parallel-scale env=ONLY=rr_alloc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib6ba8c5d8e9d3245833448a52f8ed25308698a33
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14938 tests: fail_abort() in t-f to take care of MDTs
Alex Zhuravlev [Mon, 16 Aug 2021 17:22:00 +0000 (20:22 +0300)]
LU-14938 tests: fail_abort() in t-f to take care of MDTs

fail_abort() in test-framework ensures that the clients
are back after evictions. the same should be done for
MDTs as otherwise any subsequent test may fail due to
another MDT observing eviction and interrupting current
request with -EIO.

Lustre-change: https://review.whamcloud.com/44671
Lustre-commit: 436cd4fd21ffee5830c9b4e75055db80c47547d5

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0a00ece52d28c6d28eef029a4f87a348efaa041c
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49598
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16210 llite: replace selinux_is_enabled()
Etienne AUJAMES [Thu, 6 Oct 2022 13:30:54 +0000 (15:30 +0200)]
LU-16210 llite: replace selinux_is_enabled()

selinux_is_enabled() was removed from kernel 5.1.
Commit 39e5bfa added support for such kernels by assuming SELinux to be
enabled if the function selinux_is_enabled() does not exist.

This has a performance impact: on older kernels (e.g. CentOS 7) getxattr
RPCs were not sent for "security.selinux" if selinux was disabled.
Utilities like "ls -l" always try to get "security.selinux".
See LU-549 for more information.

This patch uses security_inode_listsecurity() when mounting the
client to know whether an LSM module (selinux) requires an xattr to
store file contexts. If an xattr is returned, we store it and use it
for the security context in requests.

For getxattr/setxattr we use the stored LSM xattr to filter xattr
security contexts like security.selinux. If the xattr does not match
the stored xattr name, we return -EOPNOTSUPP to userspace.

It also adds the s_security check for security_inode_notifysecctx() to
avoid calling this function if selinux is disabled (as in
nfs_setsecurity()).

For the "Enforcing SELinux Policy Check" functionality, the selinux
check has been moved into l_getsepol: -ENODEV is returned if selinux is
disabled.

Add a regression test "sanity test_434" for this use case.

*Note:*
This patch detects that selinux is disabled without it being explicitly
disabled on the kernel cmdline. This is recommended for RHEL >= 8.5.

*Performance:*
Tests with "strace -c ls -l" with 100000 files on root in a multi-VM
env (on Rocky 9). The FS is remounted for each test (cache is cleaned)
and selinux is disabled.
 __________________ ___________ _________
| Total time %     | lgetxattr | statx   |
|__________________|___________|_________|
|Without the patch:|    29%    |   51%   |
|__________________|___________|_________|
|With the patch:   |    0%     |   87%   |
|__________________|___________|_________|
"ls -l" uses lgetxattr to get "security.selinux".

Linux-commit: 3d252529480c68bfd6a6774652df7c8968b28e41

Lustre-change: https://review.whamcloud.com/48875
Lustre-commit: 1d8faaf6caf4acaf0e2d4943b51c024a96c80624

Fixes: 39e5bfa ("LU-12355 llite: include file linux/selinux.h removed")
Fixes: 9bcac0b ("LU-549 llite: Improve statfs performance if selinux is disabled")
Test-Parameters: clientselinux=false clientdistro=el7.9 testlist=sanity env=ONLY=434,ONLY_REPEAT=20
Test-Parameters: clientselinux=false clientdistro=el8.5 testlist=sanity env=ONLY=434,ONLY_REPEAT=20
Test-Parameters: clientselinux=false clientdistro=el8.6 testlist=sanity env=ONLY=434,ONLY_REPEAT=20
Test-Parameters: clientselinux clientdistro=el8.6 testlist=sanity-selinux
Test-Parameters: clientselinux clientdistro=el8.6 testlist=sanity-selinux
Test-Parameters: clientselinux clientdistro=el7.9 testlist=sanity-selinux
Test-Parameters: clientselinux clientdistro=el7.9 testlist=sanity-selinux
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I4dac87ac0341b45a1c2fef836cdce0361017b3f5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16114 build: Update security_dentry_init_security args
Shaun Tancheff [Sun, 28 Aug 2022 14:38:39 +0000 (21:38 +0700)]
LU-16114 build: Update security_dentry_init_security args

Linux commit v5.15-rc1-20-g15bf32398ad4
   security: Return xattr name from security_dentry_init_security()

Adjust security_dentry_init_security() calls accordingly

Lustre-change: https://review.whamcloud.com/48359
Lustre-commit: 88bccc4fa4dd7310560f588c730eefedf423c515

Test-Parameters: trivial
HPE-bug-id: LUS-11188
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I42d3307f7fe0d2412381363f60ac5b3df2d5891a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-6571 tests: Added tests for sanity-lipe-find3
Alexandre Ioffe [Thu, 12 Jan 2023 23:51:12 +0000 (15:51 -0800)]
EX-6571 tests: Added tests for sanity-lipe-find3

Added sanity tests for lipe_find3
-stripe-count
-mirror-count
-path
-ipath

Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I1180bf0b667372dc9f0d48e4fbf89fbaaca7fdd7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49596
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15828 o2iblnd: reset hiw proportionally
Serguei Smirnov [Thu, 22 Dec 2022 22:42:48 +0000 (14:42 -0800)]
LU-15828 o2iblnd: reset hiw proportionally

As a result of connection negotiation, queue depth may end up
being shorter than "peer_tx_credits" tunables value. Before this
patch, the high-water mark "lnd_peercredits_hiw" would be set at
min(current hiw, queue depth - 1).

For example, considering that hiw is allowed to only be as low as
half of peer_tx_credits, negotiating queue_depth/peer_credits down
from 32 to 8 would always result in hiw set at 7, i.e. credits would
be released as late as possible.

With this patch, if queue depth is reduced, hiw is set proportionally
relative to the level it was at before:
hiw = (queue_depth * lnd_peercredits_hiw) / peer_tx_credits

Using the above example with queue depth initially at 32, negotiating
down to 8 would result in hiw set to 4 if "lnd_peercredits_hiw" is
initially at 16, 17, 18, 19; hiw set to 5 if "lnd_peercredits_hiw" is
initially at 20, 21, 22, 23, and so on.
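
The arithmetic in the example above can be checked with a short
Python sketch. It is not the o2iblnd C code: the function names are
hypothetical, and integer (truncating) division is assumed because it
reproduces the values quoted above.

```python
def hiw_before_patch(current_hiw: int, queue_depth: int) -> int:
    """Old behaviour: min(current hiw, queue depth - 1)."""
    return min(current_hiw, queue_depth - 1)


def hiw_after_patch(queue_depth: int, lnd_peercredits_hiw: int,
                    peer_tx_credits: int) -> int:
    """New behaviour: scale hiw in proportion to the reduced depth."""
    # Assumption: integer division, as in C integer arithmetic.
    return (queue_depth * lnd_peercredits_hiw) // peer_tx_credits


# Negotiating from 32 down to 8 with the lowest allowed hiw of 16.
old = hiw_before_patch(16, 8)
new = hiw_after_patch(8, 16, 32)
```
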

Lustre-change: https://review.whamcloud.com/49497
Lustre-commit: e1944c29793d489429730a9445e243b448c3d751

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I633933d7448db1ca88d3c65de9c29e870ca2c9fb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoEX-6713 doc: man pages for asynchronous PCCRO attachment
Qian Yingjin [Mon, 16 Jan 2023 03:06:50 +0000 (22:06 -0500)]
EX-6713 doc: man pages for asynchronous PCCRO attachment

This patch updates the man pages for asynchronous PCCRO attachment
for "lfs pcc attach -A" command.

Change-Id: I7757a9d0b66a3586abdc9053b73d69944561ffbd
Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49640
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn72
Andreas Dilger [Thu, 12 Jan 2023 01:12:44 +0000 (18:12 -0700)]
RM-620 build: New tag 2.14.0-ddn72

New tag 2.14.0-ddn72

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I61033de82c0777ad4acc90ca6794780f71057e0d

2 years agoEX-6682 build: add json-c-devel into lustre-dkms.spec.in
Jian Yu [Tue, 10 Jan 2023 20:02:01 +0000 (12:02 -0800)]
EX-6682 build: add json-c-devel into lustre-dkms.spec.in

While installing client DKMS package, json-c-devel package is
required. This patch adds the package requirement into
lustre-dkms.spec.in.

Test-Parameters: trivial

Fixes: fbfd2d0755 ("EX-5176 pcc: use JSON string for trusted.pin xattr")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I72f7e23a8c1ec9edecfc69b2e8dda758f215b4e2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49594
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16444 enc: null-enc names cannot be digested form
Sebastien Buisson [Wed, 4 Jan 2023 15:10:02 +0000 (16:10 +0100)]
LU-16444 enc: null-enc names cannot be digested form

When encrypted files have their names encrypted, long names are in
digested form in case access is done without the encryption key. The
digest is base64-encoded, and prepended with '_'.
With null encryption for file names, names are always plain text. In
this case, a legitimate '_' at the start of a name must not be
interpreted as a digested form.
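
The rule can be sketched in Python as below. This is a hypothetical
helper, not the Lustre C code: the function and parameter names are
assumptions, and the length test for "long names" is omitted.

```python
def may_be_digested_name(name: str, names_encrypted: bool) -> bool:
    """Decide whether a leading '_' can mark a digested name."""
    # With null encryption for file names, names are plain text, so a
    # leading '_' is just part of the name.
    if not names_encrypted:
        return False
    return name.startswith("_")
```
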

sanity-sec test_54 is improved to test the case of a file whose name
starts with '_'.

Lustre-change: https://review.whamcloud.com/49550
Lustre-commit: TBD (5487e006b1ca152be665729a4fdf273c6109f0f4)

Fixes: f18c87cb53 ("LU-13717 sec: handle null algo for filename encryption")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idaad186afd06cfbabbe1d13e78f083d12876c8ff
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49552
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16160 revert: "llite: clear stale page's uptodate bit"
Andreas Dilger [Fri, 6 Jan 2023 18:19:33 +0000 (18:19 +0000)]
LU-16160 revert: "llite: clear stale page's uptodate bit"

This reverts commit 451b4ac514dd03c4fe91726da2f95a1f5575a5a6
which caused a bug in cl_page_own() race with ll_releasepage()
and cl_pagevec_put() assertion failure.

Lustre-change: https://review.whamcloud.com/49541
Lustre-commit: TBD (from ef330e09a59da0df2de153ecdb2e7d8729cd6b63)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Icdb8c60f4d992c9976670e1b06c5bab5ef3a3954
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49576
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-6220 lipe: lipe_find3 has no mirror-count and stripe-count
Alexandre Ioffe [Sat, 7 Jan 2023 06:07:00 +0000 (22:07 -0800)]
EX-6220 lipe: lipe_find3 has no mirror-count and stripe-count

Added mirror-count and stripe-count search options

Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I79c3b1cd0b1759abce248bee73676a823441825c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
2 years agoLU-14645 tests: test lfs setdirstripe with '/$'
Jian Yu [Tue, 27 Dec 2022 01:13:31 +0000 (17:13 -0800)]
LU-14645 tests: test lfs setdirstripe with '/$'

This patch improves one of the lfs setdirstripe tests to
verify that dir name ending with '/' also works.

Lustre-change: https://review.whamcloud.com/49463
Lustre-commit: 4b9a39d3ed58a664a2498911ca1d3c9073c13bd3

Test-Parameters: trivial mdscount=2 mdtcount=4 \
env=ONLY=24B testlist=sanity

Change-Id: I237d5a9ebad42cc0569aa1db487d0df147372316
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49464
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoEX-6170 tests: make sanity-lipe-scan3.sh support remote MDS
Jian Yu [Thu, 5 Jan 2023 07:51:38 +0000 (23:51 -0800)]
EX-6170 tests: make sanity-lipe-scan3.sh support remote MDS

The sanity-lipe-scan3.sh script was written to only run on a local
client+MDS configuration. This patch fixes it to support running
the lipe_scan3 command on remote MDS.

Test-Parameters: trivial testlist=sanity-lipe-scan3
Test-Parameters: trivial mdscount=2 mdtcount=4 \
testlist=sanity-lipe-scan3

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I0ede3420c6f529cdfb9e97a5664945a5c2f0ff09
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49559
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn71
Andreas Dilger [Wed, 4 Jan 2023 23:07:16 +0000 (16:07 -0700)]
RM-620 build: New tag 2.14.0-ddn71

New tag 2.14.0-ddn71

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia77c09c628c0b5f76497825f8e2142cd1bc6ad96

2 years agoLU-14645 utils: setstripe cleanup
Vitaly Fertman [Tue, 20 Dec 2022 19:50:47 +0000 (11:50 -0800)]
LU-14645 utils: setstripe cleanup

lfs setstripe checks stripe parameters differently for PFL and !PFL
layouts. Whereas the PFL layout is checked in comp_args_to_layout()
individually and in llapi_layout_sanity_cb() in pairs, !PFL layout
verification is done partially in several places. Create a common
llapi_stripe_param_verify() for this purpose. Make the checks for
both cases symmetric.

skip some excessive checks:
- do not check the file is on lustre fs, the following ioctl does it;
- do not check the stripe-index is valid, done on MDS side;
- do not check the pool exists for a !PFL file (align with a setstripe
  for PFL files);

Lustre-change: https://review.whamcloud.com/43465
Lustre-commit: 149934fe28dac22a51ec9b2873c4f215cb204947

Lustre-change: https://review.whamcloud.com/46151
Lustre-commit: 5e65d6a8e57a5a17c4c7e043cb46e86bf82b7782

Lustre-change: https://review.whamcloud.com/46152
Lustre-commit: cd1f8527d414a12ec7eb5b69fe30509a45b33ad4

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I456b1b2e876229ac1a354d4e3879624325856574
HPE-bug-id: LUS-9886
Reviewed-on: https://es-gerrit.dev.cray.com/158589
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
2 years agoLU-16187 tests: Fix is_project_quota_supported()
Arshad Hussain [Mon, 26 Sep 2022 09:31:41 +0000 (15:01 +0530)]
LU-16187 tests: Fix is_project_quota_supported()

is_project_quota_supported() is called from sanity-quota.sh
to verify if the ldiskfs FS $ENABLE_PROJECT_QUOTAS is true
and to verify if current version of lfs command supports
'project'.  To do this it calls 'lfs --help', which is
not supported. This patch replaces the 'lfs --help' call with an
'lfs --list-commands' call to verify whether the present
version of lfs supports 'project'.

Lustre-change: https://review.whamcloud.com/48654
Lustre-commit: d4848d779bb8716c6df2fe5438fbe00997f87f3d

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iba7e6696d3fa9e980088f448ae72b07a4b47f4f2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49454
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
2 years agoEX-5819 tests: wait longer for lpcc.sock
Lei Feng [Fri, 30 Dec 2022 01:47:21 +0000 (09:47 +0800)]
EX-5819 tests: wait longer for lpcc.sock

wait a little longer for lpcc.sock in sanity-pcc/test_210.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=210
Change-Id: I359782b5de86d7354df2db169f85a18490602d7d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49531
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16390 tests: check Lustre filefrag in sanity-flr/49a
Andreas Dilger [Tue, 13 Dec 2022 07:01:06 +0000 (00:01 -0700)]
LU-16390 tests: check Lustre filefrag in sanity-flr/49a

Check that a Lustre-patched filefrag is installed when running
sanity-flr test_49a.

Lustre-change: https://review.whamcloud.com/49386
Lustre-commit: 37f18670e49b8150170f9b724b5f7089fa176c4e

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic909ea4ca160d47480004f53a96ce7539ce5076c
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49503
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16412 llite: check truncated page in ->readpage()
Qian Yingjin [Mon, 19 Dec 2022 06:57:39 +0000 (01:57 -0500)]
LU-16412 llite: check truncated page in ->readpage()

The page end offset calculation in filemap_get_read_batch() was
off by one. This bug was introduced in commit v5.11-10234-gcbd59c48ae
("mm/filemap: use head pages in generic_file_buffered_read")

When a read is submitted with end offset 1048575, it calculates
the end page index for the read as 256 where it should be 255. As a
result, the readpage() call for the page with index 256 crosses the
stripe boundary and may not be covered by a DLM extent lock.
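
The numbers can be reproduced with a short Python sketch. The 4096
byte page size and the 1 MiB stripe size are assumptions, the function
names are hypothetical, and the rounded-up variant only shows one way
such an off-by-one can arise; it is not a copy of the kernel code.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
STRIPE_SIZE = 1 << 20     # assumption: 1 MiB stripes


def last_page_index(end_offset: int) -> int:
    """Index of the page holding the last byte (inclusive offset)."""
    return end_offset // PAGE_SIZE


def rounded_up_index(end_offset: int) -> int:
    """Page count rounded up; off by one if used as the last index."""
    return (end_offset + 1 + PAGE_SIZE - 1) // PAGE_SIZE


def stripe_of_page(page_index: int) -> int:
    """Stripe number that a page index falls into."""
    return (page_index * PAGE_SIZE) // STRIPE_SIZE
```

With end offset 1048575 the correct last index stays in stripe 0,
while the rounded-up value points at the first page of stripe 1.
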

This happens in a corner race case: filemap_get_read_batch()
batches the page with index 256 for read, but later this page is
removed from the page cache because the lock protecting it is revoked,
while it still holds a reference count due to the batch.  As a result,
this page in the read path is not covered by any DLM lock.

The solution is simple. In ->readpage() we can check whether the page
was truncated and removed from the page cache by looking at the
address_space pointer of the page. If it was truncated, return
AOP_TRUNCATED_PAGE to the upper caller.  This will cause the
kernel to retry to batch pages and the truncated page will not
be added as it was already removed from page cache of the file.

Add sanityn/test_95 to verify it.

Lustre-change: https://review.whamcloud.com/49433
Lustre-commit: TBD (from 02fe613db9517875c03e8a919e1b42cb1ba7c619)

Test-Parameters: testlist=sanityn env=ONLY=95 clientdistro=ubuntu2204
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I192df92b1d1b79057055430cc81cb7cc760cc9ed
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49434
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15115 ptlrpc: recalc timer on EINPROGRESS reply
Alexander Zarochentsev [Fri, 15 Oct 2021 18:27:29 +0000 (21:27 +0300)]
LU-15115 ptlrpc: recalc timer on EINPROGRESS reply

ptlrpcd doesn't recalculate the wait queue timer after
getting an -EINPROGRESS reply. This may delay the request resend
until it times out.

Lustre-change: https://review.whamcloud.com/45266
Lustre-commit: 9a5bace55a5ddb8a928af2de1b199e968f3fbecd

HPE-bug-id: LUS-10366
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Idc76c688a0f7ff8e110446fd1fe13dd83f636f3b
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49513
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16413 osd-ldiskfs: fix T10PI for CentOS 8.x
Li Dongyang [Mon, 19 Dec 2022 10:03:47 +0000 (21:03 +1100)]
LU-16413 osd-ldiskfs: fix T10PI for CentOS 8.x

Recreate the currently broken lustre kernel patches
to allow using custom integrity functions for bio.
Note we don't need to save the generate_fn anymore,
it will be used once we call bio_integrity_prep_fn().

Add upstream fix
b13e0c718568 ("block: bio-integrity: Advance seed correctly
for larger interval sizes") for CentOS 8.0 to 8.6.

Handle the kernel api changes for the T10PI generate and
verify functions introduced in CentOS 8.x kernel,
mostly because of switching to blk_integrity_iter.

Update the custom generate and verify functions, to sync
with upstream versions.
- Add T10-DIF-TYPE2, currently only a place holder,
  not used in upstream either.
- Use __be16 instead of __u16 for guard tags.

Only reuse guard tags if the rpc checksum is the same
one supported on the target. We already have some protection
during checksum type negotiation, the server
will mark the target's T10PI type as the only
T10PI checksum type supported. But it's still good to
have the logic in place.

Do not call bio_integrity_prep() if the custom interface
bio_integrity_prep_fn() does not exist, submit_bio() will
do that for us.

On the servers, show the target's T10PI checksum as
the preferred checksum_type even if it's not the fastest.
Note this is only cosmetic and does not impact the checksum
type used, which is still done during negotiation.

Lustre-change: https://review.whamcloud.com/49441
Lustre-commit: TBD (from a0c96829a760a5cf199e5278bf2693f2618b77c9)

Change-Id: I2d0ba0b80ba9cde2977da24db08095671aa5373c
Test-Parameters: trivial
Fixes: 293844d132 ("LU-16222 kernel: RHEL 8.7 client and server support")
Fixes: f176efd183 ("LU-12269 kernel: RHEL 8.0 server support")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49483
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16376 obdclass: NUL terminate long jobid strings
Andreas Dilger [Thu, 8 Dec 2022 18:43:57 +0000 (11:43 -0700)]
LU-16376 obdclass: NUL terminate long jobid strings

It appears that some jobid names can be sent that are using the full
32-byte size, rather than containing an embedded NUL terminator. This
caused errors in lprocfs_job_stats_log() when it overflowed.

If lustre_msg_get_jobid() finds no NUL terminator within the
buffer, add one, so that the rest of the code doesn't
have to deal with unterminated strings.

This potentially exposes a larger issue that other places may not be
handling the unterminated string properly either, which needs to be
addressed separately on both the client and server.  Terminating the
jobid to 31 chars only on the client does not totally solve the issue,
since there will still be older clients that are not doing this, so
the server needs to handle this in any case.
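
A minimal sketch of the termination step, assuming the 32-byte field
size mentioned above. The names JOBID_SIZE and jobid_terminate() are
hypothetical and are not the identifiers used by the patch; the sketch
only shows how a full, unterminated buffer can be cut to 31 characters
plus a NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define JOBID_SIZE 32	/* field size from the commit message */

/*
 * Hypothetical helper: make sure a fixed-size jobid buffer holds a
 * NUL-terminated string. If no NUL is found within the buffer, the
 * last byte is overwritten, truncating the jobid to 31 characters.
 * Returns the length of the resulting string.
 */
size_t jobid_terminate(char jobid[JOBID_SIZE])
{
	const char *nul = memchr(jobid, '\0', JOBID_SIZE);

	if (nul != NULL)
		return (size_t)(nul - jobid);

	jobid[JOBID_SIZE - 1] = '\0';
	return JOBID_SIZE - 1;
}
```

A receiver can run a helper like this once on the incoming buffer and
then pass the result to ordinary string functions.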

Lustre-change: https://review.whamcloud.com/49351
Lustre-commit: 9eba5d57297f807fddf046356c846478bbf232f4

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4c05fabdacb6a0bbf6477d3601a628fe1f3ebbe5
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49501
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14552 ptlrpc: NULL pointer dereference in ptlrpc_watchdog_fire
Andriy Skulysh [Mon, 1 Mar 2021 21:41:33 +0000 (23:41 +0200)]
LU-14552 ptlrpc: NULL pointer dereference in ptlrpc_watchdog_fire

thread->t_task isn't initialized by target_recovery_thread()

Lustre-change: https://review.whamcloud.com/43115
Lustre-commit: 14a1102268941d851ef5ef793923e39081b81ff4

Change-Id: Ia38d5ccaab6b9332a1fd60ebe5ed2461f7d5db84
HPE-bug-id: LUS-9748
Fixes: 0496cdf20 ("LU-13608 tgt: abort recovery while reading update llog")
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49486
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15081 vfs: set_nlink() is not race-safe
Andrew Perepechko [Mon, 11 Oct 2021 19:11:05 +0000 (22:11 +0300)]
LU-15081 vfs: set_nlink() is not race-safe

set_nlink() is not atomic wrt race with itself and
the following warning may be triggered by VFS:

WARNING: CPU: 5 PID: 195090 at fs/inode.c:241 __destroy_inode+0xdb/0xf0

It does not seem important what exact nlink value is the result
of the race. However, we need to protect the superblock remove
counter.

Lustre-change: https://review.whamcloud.com/45191
Lustre-commit: 12b05772fdb6d080819b6c213fcd7f8705278412

Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-9825
Change-Id: I67bc345b9a9e43fb88d919a83246759d11604b03
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoEX-6169 lipe: sanity-lipe-find3 reformat to clean lost+found
Alexandre Ioffe [Tue, 13 Dec 2022 06:20:33 +0000 (22:20 -0800)]
EX-6169 lipe: sanity-lipe-find3 reformat to clean lost+found

Reformat file system when .lustre/lost+found/ has garbage

Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ib78b06e685aaeabb8356662747285ed7a27dde15
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49385
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-6187 lipe: lipe_find3 missing option ipath
Alexandre Ioffe [Tue, 20 Dec 2022 07:35:13 +0000 (23:35 -0800)]
EX-6187 lipe: lipe_find3 missing option ipath

Added missing lexical ipath

Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I62260e054a9c514aa31d378322b6840f75edf221
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49455
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn70
Andreas Dilger [Sat, 17 Dec 2022 02:30:27 +0000 (19:30 -0700)]
RM-620 build: New tag 2.14.0-ddn70

New tag 2.14.0-ddn70

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf30877c3a4c88c20a64814670c7e97e4f0cc5e0

2 years agoLU-15935 tests: add version check to replay-dual test_33
Jian Yu [Wed, 14 Dec 2022 02:13:33 +0000 (18:13 -0800)]
LU-15935 tests: add version check to replay-dual test_33

This patch adds MDS version check to replay-dual test_33
to avoid interop test failure.

Lustre-change: https://review.whamcloud.com/49398
Lustre-commit: TBD (from 0027fba3d3f797407fad9f3995f839a431e49782)

Test-Parameters: trivial \
serverjob=lustre-b_es5_2 serverbuildno=539 \
env=ONLY=33 testlist=replay-dual

Test-Parameters: trivial env=ONLY=33 testlist=replay-dual

Change-Id: I3ec665302a431d3c0f07bc819a08237dbc5b4309
Fixes: 1a79d395dd ("LU-15935 target: keep track of multirpc slots in last_rcvd")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15234 lnet: add mechanism for dumping lnd peer debug info
Serguei Smirnov [Mon, 28 Feb 2022 19:04:00 +0000 (11:04 -0800)]
LU-15234 lnet: add mechanism for dumping lnd peer debug info

Add ability to dump lnd peer debug info:
lnetctl debug peer --nid=<nid>

The debug info is dumped to the log as D_CONSOLE by the respective
lnd and can be retrieved with "lctl dk" or seen in syslog.
This mechanism has been added for socklnd and o2iblnd peers.

Lustre-change: https://review.whamcloud.com/48566
Lustre-commit: 950e59ced18d49e9fdd31c1e9de43b89a0bc1c1d

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia9c4d59143206bcb7ec43806594cf0cfaed5f0a9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49038
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-5924 lipe: lipe_scan3 ERROR replaced by WARNING
Alexandre Ioffe [Fri, 9 Dec 2022 22:52:55 +0000 (14:52 -0800)]
EX-5924 lipe: lipe_scan3 ERROR replaced by WARNING

Decrease severity of the message down to WARNING.

Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I2f4b885248692e042ba9eb0f97736401e6d35de6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49355
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
2 years agoLU-16378 lnet: handles unregister/register events
Cyril Bordage [Mon, 12 Dec 2022 10:49:11 +0000 (11:49 +0100)]
LU-16378 lnet: handles unregister/register events

When the network is restarted, devices are unregistered and then
registered again. When a device registers using an index that is
different from the previous one (before the network was restarted), LNet
ignores it. Consequently, this device stays with its link in fatal state.

To fix that, we catch unregistering events to clear the saved index
value, and when a registering event comes, we save the new value.
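
The index handling described above can be sketched as a small state
machine. This is an illustration only: struct ni_state, the event
names and IF_INDEX_NONE are hypothetical, and matching the device by
name (which a real handler would need) is left out.

```c
#include <assert.h>

#define IF_INDEX_NONE (-1)	/* hypothetical "no index saved" marker */

enum dev_event { DEV_UNREGISTER, DEV_REGISTER, DEV_UP, DEV_DOWN };

struct ni_state {
	int saved_index;	/* interface index the NI is bound to */
	int fatal;		/* non-zero while the link is considered down */
};

/* Returns 1 if the event was applied to this NI, 0 if it was ignored. */
int ni_handle_event(struct ni_state *ni, enum dev_event ev, int if_index)
{
	switch (ev) {
	case DEV_UNREGISTER:
		if (ni->saved_index != if_index)
			return 0;
		ni->saved_index = IF_INDEX_NONE;	/* forget the old index */
		return 1;
	case DEV_REGISTER:
		if (ni->saved_index != IF_INDEX_NONE)
			return 0;
		ni->saved_index = if_index;		/* remember the new one */
		return 1;
	case DEV_UP:
	case DEV_DOWN:
		if (ni->saved_index != if_index)
			return 0;			/* event for another index */
		ni->fatal = (ev == DEV_DOWN);
		return 1;
	}
	return 0;
}
```

Without the DEV_UNREGISTER and DEV_REGISTER cases, a link-up event
carrying the new index would never match the saved one, which is the
stuck fatal state described in the message.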

Lustre-change: https://review.whamcloud.com/49375/
Lustre-commit: TBD (from 7442710a56a8f38453441c62253c0ad891fe9b8c)

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I17e93a1103d588f3e630a9c7446b345f4d472b97
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16373 tests: failover mds1 back to the primary server
Jian Yu [Thu, 15 Dec 2022 19:38:56 +0000 (11:38 -0800)]
LU-16373 tests: failover mds1 back to the primary server

This patch fixes recovery-small test 144a to failover
mds1 back to the primary server so that stack_trap can
set timeout parameter on the correct mds node.

Lustre-change: https://review.whamcloud.com/49345
Lustre-commit: TBD (from 68c75d28fe86ac890d242c004c664f872204b660)

Test-Parameters: trivial \
env=SLOW=yes,FAILURE_MODE=HARD,ONLY=144a \
clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
austeroptions=-R failover=true iscsi=1 \
testlist=recovery-small

Change-Id: Idbfdb7b084c7edac8784008e0455f76632aa685b
Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49419
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16329 Revert "LU-8621 utils: cmd help to stdout or short cmd error"
Andreas Dilger [Thu, 15 Dec 2022 15:30:32 +0000 (08:30 -0700)]
LU-16329 Revert "LU-8621 utils: cmd help to stdout or short cmd error"

This reverts commit 608d763955d7e0a9c438c317e595f14825e9423b.
This breaks bash command completion.

Fixes: bc69a8d058 ("LU-8621 utils: cmd help to stdout, short cmd error")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I004ea5af499593b0f36ba17ff5f517548f0ea0f9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoEX-6349 Revert "LU-14661 obdclass: Add peer/peer NI when processing llog"
Alex Zhuravlev [Wed, 14 Dec 2022 19:00:01 +0000 (22:00 +0300)]
EX-6349 Revert "LU-14661 obdclass: Add peer/peer NI when processing llog"

This reverts commit e8ddb2f550072cdd3489389c107af3e892a21f66.
It is causing problems with reconnection at failover.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I53594f8f93474666c4abd96291d58dadf8ac5969
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49411
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15643 osd-ldiskfs: don't trigger scrub on irreparable FIDs
Lai Siyao [Tue, 15 Mar 2022 19:43:14 +0000 (15:43 -0400)]
LU-15643 osd-ldiskfs: don't trigger scrub on irreparable FIDs

In osd_fid_lookup(), if the FID mapping found in OI table is insane,
it will be added into a list called os_inconsistent_items, and OI
scrub will be triggered.

Later if OI scrub can't fix this mapping, it should move this mapping
into a list called os_stale_items, and subsequent access of the same
FID should return -ESTALE immediately, rather than triggering OI
scrub repeatedly.
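
A simplified sketch of the lookup behaviour described above. The
struct stale_set type and both functions are hypothetical stand-ins
for os_stale_items and osd_fid_lookup(); FIDs are reduced to plain
integers and the return value 0 only means "not known to be stale".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_STALE 16	/* arbitrary capacity for the sketch */

struct stale_set {
	unsigned long long ids[MAX_STALE];
	size_t count;
};

/* Record an id that scrub could not repair. */
int stale_set_add(struct stale_set *s, unsigned long long id)
{
	if (s->count == MAX_STALE)
		return -ENOSPC;
	s->ids[s->count++] = id;
	return 0;
}

/*
 * Known-irreparable ids fail fast with -ESTALE and never request a
 * scrub. Other ids request a scrub only if their mapping looks insane.
 */
int fid_lookup(const struct stale_set *s, unsigned long long id,
	       int mapping_is_sane, int *trigger_scrub)
{
	size_t i;

	*trigger_scrub = 0;
	for (i = 0; i < s->count; i++)
		if (s->ids[i] == id)
			return -ESTALE;
	if (!mapping_is_sane)
		*trigger_scrub = 1;
	return 0;
}
```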

Add sanity-scrub 20. Remove sanity-scrub 1d, which is not a sane test
because it altered FID in LMA, which is the last to trust for an
object, and it could pass just by chance.

Lustre-change: https://review.whamcloud.com/46852
Lustre-commit: 558784caad491be50e93ae60a31d4219a1e038bc

Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3ed8928506551416b1008121adbe385dedda29bc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49424
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn69
Andreas Dilger [Tue, 13 Dec 2022 19:12:09 +0000 (12:12 -0700)]
RM-620 build: New tag 2.14.0-ddn69

New tag 2.14.0-ddn69

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I592bd3a6fdb9db02bbe1a18c6e84d9b61a639f95

2 years agoEX-6497 lipe: Refine stats field name in lamigo
Alexandre Ioffe [Thu, 8 Dec 2022 06:45:35 +0000 (22:45 -0800)]
EX-6497 lipe: Refine stats field name in lamigo

Corrected the "processed" INFO message that lamigo prints
periodically:
- Added two additional fields:
  "running" - number of currently running jobs such as replication
  "delayed" - current number of failed and other (such as set flag)
  jobs which are waiting to be run on the next lamigo cycle
- Renamed the "in queue" field to "awaiting". This is the current number
  of files in the internal cache. These files are waiting to be
  processed (replicated)

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Iacf0199cfcf56edcbb8ad91e0e4b62c7451900f5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-6298 lipe: decrease delay before ALR restart
Alexandre Ioffe [Sat, 19 Nov 2022 05:39:08 +0000 (21:39 -0800)]
EX-6298 lipe: decrease delay before ALR restart

- Decrease the delay before restarting the access log reader (ALR)
and eliminate this delay when the read from ALR fails
due to a timeout. Increase the SSH poll/read timeout as long as
the keep-alive message in ofd_access_log_reader is not
implemented.
This will decrease the probability of missing ALR.
- Stop excluding hot-pools test_72

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I36989e9c3fd877aee5ce1cfb8525db8604e666bd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49196
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16353 config: enable_foo variables mustn't contains space
Mr NeilBrown [Thu, 1 Dec 2022 17:53:01 +0000 (09:53 -0800)]
LU-16353 config: enable_foo variables mustn't contains space

$enable_crypto is in some circumstances set to "embedded llcrypt"
which contains a space.
When the code from lustre-build.m4 then tests the value with:

   if test x$enable_crypto = xyes

we get a syntax error from ./configure

We could add quotes to this test, but for consistency we would need
to add quotes to every other test for an enable_foo variable.

It is simpler just to ensure we don't add spaces.  So change the space
to a hyphen.

Lustre-change: https://review.whamcloud.com/49282
Lustre-commit: c8a33e5322b0675680f8d737f04259799d30aa0e

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I097e857409d6ec48a765ccda1cc470d28b90e601
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49295
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16051 o2iblnd: detect link state to set fatal error on ni
Serguei Smirnov [Fri, 23 Sep 2022 22:20:51 +0000 (15:20 -0700)]
LU-16051 o2iblnd: detect link state to set fatal error on ni

To avoid selecting lnet ni which corresponds to a downed link
for sending, add a mechanism for detecting ip-layer link events
in o2iblnd. On ip link up/down events, find corresponding
ni and toggle ni_fatal_error_on flag. This complements the
existing mechanism for ib-layer link event handling.

Lustre-change: https://review.whamcloud.com/48644
Lustre-commit: 30d73908087d5b2f0b18cce95826c4825c030ad4

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I4720cd0a7bc577a522c7d40b54f821a4c12b670f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49315
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14992 tests: add more mkdir_on_mdt0 calls
Mr NeilBrown [Tue, 29 Nov 2022 02:31:21 +0000 (18:31 -0800)]
LU-14992 tests: add more mkdir_on_mdt0 calls

A previous patch changed some mkdir calls in test_133a to
mkdir_on_mdt0. This allows stats collected from mdt0 to
reflect the mkdir.

However two mkdir calls were missed, so "crossdir_rename" stats can be
wrong.

Lustre-change: https://review.whamcloud.com/49252
Lustre-commit: d56ea0c80a959ebd9b393f2da048cc179cb16127

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity env=ONLY=133a

Fixes: 543341afc3 ("LU-14992 tests: sanity/replay-vbr mkdir on MDT0")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4e5c2e5504307462bff4012a13ef9deb24f8da8c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49262
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16308 llite: wake_up after cl_object_kill
Lai Siyao [Thu, 10 Nov 2022 13:15:51 +0000 (08:15 -0500)]
LU-16308 llite: wake_up after cl_object_kill

cl_inode_fini() calls cl_object_kill() to set LU_OBJECT_HEARD_BANSHEE,
and then calls cl_object_put_last() to wait for the object refcount to
become one. It should call wake_up() in between, in case someone is
waiting on the flag.

Lustre-change: https://review.whamcloud.com/49130
Lustre-commit: 3a0a6c7a88499a78c9bfc6ac514d05eba60312c9

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I244db71ee4ed9c39118e443b99c3b8a3a0aa4bc3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-6468 pcc: add threshold to determine direct I/O during attach
Qian Yingjin [Wed, 30 Nov 2022 14:29:47 +0000 (09:29 -0500)]
EX-6468 pcc: add threshold to determine direct I/O during attach

This patch adds a tunable threshold parameter that determines whether
data copying during attach uses direct I/O or buffered I/O:
llite.*.pcc_dio_attach_threshold
The default value is the same as the direct I/O size: 32MiB.

The parameter "pcc_dio_attach_size_mb" is deprecated;
use "pcc_dio_attach_iosize_mb" instead.

Change-Id: I393d6a06523303e749192ba9978449c3d75886ae
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49286
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn68
Andreas Dilger [Tue, 6 Dec 2022 05:15:41 +0000 (22:15 -0700)]
RM-620 build: New tag 2.14.0-ddn68

New tag 2.14.0-ddn68

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id4e3d1a9f28afe251e55582c84acaf98ebfe9954

2 years agoLU-15852 lnet: Don't modify uptodate peer with temp NI
Chris Horn [Wed, 30 Mar 2022 18:35:23 +0000 (13:35 -0500)]
LU-15852 lnet: Don't modify uptodate peer with temp NI

When processing the config log it is possible that we attempt to
add temp NIs after discovery has completed on a peer. These temp NIs
may not actually exist on the peer. Since discovery has already
completed the peer is considered up-to-date and we can end up with
incorrect peer entries. We shouldn't add temp NIs to a peer that
is already up-to-date.

Lustre-change: https://review.whamcloud.com/47322
Lustre-commit: 8f718df474e453fbc69dfe90214e71565963f6db

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ia484713b1e6c9e1a46e525589b7c741c6478e417
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49303
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15938 llog: more checks in llog_reader
Mikhail Pershin [Tue, 2 Aug 2022 12:41:52 +0000 (15:41 +0300)]
LU-15938 llog: more checks in llog_reader

Add more correctness checks and reports in llog_reader:
- better report wrong record length and chunk skipping case
- add tail check: tail id and len should be the same as in head
- better report for gap in record indices
- test case with two corruption types:
  1) llog has bits set in bitmap beyond file end
  2) corruption in the middle
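
The record checks listed above can be sketched as follows. The struct
layouts, the enum and check_record() are simplified, hypothetical
stand-ins rather than the real llog structures; in particular,
treating any index gap as a reportable condition is an assumption of
this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for a record header and tail. */
struct rec_hdr  { uint32_t len; uint32_t index; };
struct rec_tail { uint32_t len; uint32_t index; };

enum rec_status { REC_OK, REC_BAD_LEN, REC_TAIL_MISMATCH, REC_INDEX_GAP };

/*
 * Check one record: the length must cover header and tail and fit in
 * the remaining chunk bytes, the tail must repeat the header's length
 * and index, and the index must follow the previous record's index.
 */
enum rec_status check_record(const struct rec_hdr *hdr,
			     const struct rec_tail *tail,
			     uint32_t prev_index, uint32_t bytes_left)
{
	if (hdr->len < sizeof(*hdr) + sizeof(*tail) || hdr->len > bytes_left)
		return REC_BAD_LEN;
	if (tail->len != hdr->len || tail->index != hdr->index)
		return REC_TAIL_MISMATCH;
	if (hdr->index != prev_index + 1)
		return REC_INDEX_GAP;
	return REC_OK;
}
```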

Lustre-change: https://review.whamcloud.com/48112
Lustre-commit: 386ffcdbb4c9b89f798de4c83a51a3f020542c8b

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I0c2af6ae2592c94e14e90ead12e28104409313b2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49214
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
2 years agoLU-16317 build: dkms build requires flex, bison and libmount-devel
Jian Yu [Tue, 29 Nov 2022 17:14:22 +0000 (09:14 -0800)]
LU-16317 build: dkms build requires flex, bison and libmount-devel

This patch fixes lustre.spec.in and lustre-dkms.spec.in to add
requires for flex, bison, libmount and libmount-devel. The last
two have already been added into lustre.spec.in.

Lustre-change: https://review.whamcloud.com/49183
Lustre-commit: c74c630ff7596317d1b500fd385fca271b31708c

Test-Parameters: trivial

Fixes: 121a79651f ("LU-15967 build: configure script does not check for required build tools")
Fixes: f21b944127 ("LU-15940 build: add a required dependency for libmount")

Change-Id: I9923fc7eb09f974e8c38c3664138486a424e16d7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoEX-6373 pcc: asynchronous PCCRO attach command support
Qian Yingjin [Fri, 11 Nov 2022 09:01:02 +0000 (04:01 -0500)]
EX-6373 pcc: asynchronous PCCRO attach command support

Currently PCCRO attach via the command "lfs pcc attach" will block
during the data copying.
There is a requirement that this command can also do data copy
asynchronously. Thus we add an option "--async|-A" to the command
which will not block while the file data is being fetched.

Add sanity-pcc/test_{103, 104} to verify that it works correctly.

Change-Id: I6f31190c8b9e9b9876b34f8e484c6c8b7f16b6db
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49133
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16313 pcc: use two bits to indicate pcc type for attach
Qian Yingjin [Tue, 15 Nov 2022 06:57:08 +0000 (01:57 -0500)]
LU-16313 pcc: use two bits to indicate pcc type for attach

PCC currently supports two types: readwrite and readonly.
The attach data structure @lu_pcc_attach uses a 32-bit value to
indicate the PCC type:
struct lu_pcc_attach {
	__u32 pcca_type;
	__u32 pcca_id;
};

This patch changes it to use 2 bits to represent the PCC type.
The remaining bits in @pcca_type can be used as flags for attach, such
as a flag to indicate asynchronous attach via the
command "lfs pcc attach -A" for PCCRO.
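
A minimal sketch of packing a 2-bit type and flag bits into one 32-bit
word. All macro names, enum values and the position of the async flag
are assumptions made for illustration; they are not taken from the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define PCCA_TYPE_BITS	2
#define PCCA_TYPE_MASK	((1u << PCCA_TYPE_BITS) - 1)	/* low two bits */
#define PCCA_FL_ASYNC	(1u << PCCA_TYPE_BITS)		/* hypothetical flag bit */

enum pcc_type { PCC_NONE = 0, PCC_READWRITE = 1, PCC_READONLY = 2 };

/* Combine a type (low bits) with flags (all remaining bits). */
uint32_t pcca_pack(enum pcc_type type, uint32_t flags)
{
	return ((uint32_t)type & PCCA_TYPE_MASK) | (flags & ~PCCA_TYPE_MASK);
}

/* Extract the type from the packed word. */
enum pcc_type pcca_type_of(uint32_t packed)
{
	return (enum pcc_type)(packed & PCCA_TYPE_MASK);
}

/* Test the hypothetical asynchronous-attach flag. */
int pcca_is_async(uint32_t packed)
{
	return (packed & PCCA_FL_ASYNC) != 0;
}
```

Masking on both sides keeps a stray flag from corrupting the type and
keeps an out-of-range type from setting a flag.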

Lustre-change: https://review.whamcloud.com/49160
Lustre-commit: 6e90974b1f4ac24c5a5d45ecc9bdb4d47018dab4

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Idee26018642a174b04d1d36a81952ea98a06514e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49163
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn67
Andreas Dilger [Tue, 6 Dec 2022 02:05:39 +0000 (19:05 -0700)]
RM-620 build: New tag 2.14.0-ddn67

New tag 2.14.0-ddn67

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia40ed3b7d185fa171586d5ca377714518fdc5e2e

2 years agoLU-8585 llapi: use open_by_handle_at in llapi_open_by_fid
Quentin Bouget [Sun, 2 Jan 2022 16:12:42 +0000 (11:12 -0500)]
LU-8585 llapi: use open_by_handle_at in llapi_open_by_fid

Reimplement llapi_open_by_fid() to use llapi_fid_to_handle() and
open_by_handle_at(2) rather than using ioctl().  This works for
opens on subdirectory mountpoints, unlike ".lustre/fid/<fid>".

This patch also adds llapi_open_by_fid_at() which is similar to
llapi_open_by_fid() except that it takes an open directory file
descriptor or AT_FDCWD rather than a path as its first argument.

[AD:
- Move get_root_*() functions over to a new liblustreapi_root.c
  file in expectation of further enhancements to that code.
- Cache an open file handle on the root directory so repeated
  calls to llapi_open_by_fid() and llapi_fid2path() do not need
  to search for and open the same root directory path many times.
- Add man pages for newly-added functions.

  This reduces the system calls for llapi_fid_test significantly:

      original     patched
         14511        4315   total opens
         64807       34067   total syscalls
]

There may still be a need to have a fallback from open_by_handle_at()
to using ".lustre/fid/<FID>" to open the fid (if available), but
that can be added if this initial patch does not test well.  The
open_by_handle_at() method avoids reopening the "fid/" directory
each time (though this fd could also be cached), but it has the
drawback that it reconnects dentries to the root directory each time.

Lustre-change: https://review.whamcloud.com/36603
Lustre-commit: bdf7788d19985bb7abf2385add15f1d67f3d01e4

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I8a4904c996389da2b0894cd9fac639a398607535
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49202
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15833 llapi: don't use realpath in llapi_search_fsname()
Etienne AUJAMES [Mon, 9 May 2022 13:44:29 +0000 (15:44 +0200)]
LU-15833 llapi: don't use realpath in llapi_search_fsname()

This patch uses the st_dev value to determine the fsname in
llapi_search_fsname().
The main purpose of this is to limit the number of lstat() calls
(made by realpath()) in this function.

get_root_path() is modified to search for a mountpoint by device.
The last result of get_root_path() is cached to avoid reading
/proc/mounts on each call.

A new API function, llapi_search_rootpath_by_dev(), is added to get
the path of the Lustre mountpoint for the specified device value.

**Testing:**

*Environment:*
VMs: 1 client, 1 MDS (2MDT), 1 OSS (2 OST)
Lustre tree: test{001..100}/test{001..100}/test{01..10}/file{01..05}
(500000 files + 110100 folders)
OS: Centos 7 (no statx)
Lustre: 2.15.50_15_g1116739

*Tests*
cd <rootfs>
strace lfs getstripe -r .
echo 3 > /proc/sys/vm/drop_caches
/usr/bin/time lfs getstripe -r . (2 iterations)

*Results*
times (s):

                 ______________________________
                | user | system | real | real% |
 _______________|______|________|______|_______|
|without patch: | 6.18 | 57.3   | 427  | 0%    |
|_______________|______|________|______|_______|
|with patch:    | 2.88 | 47.3   | 404  |-5.45% |
|_______________|______|________|______|_______|

strace (only significant changes are displayed):
(*stat = lstat + stat + fstat)
                 _____________________________________________
                | *stat  | mmap   | open   | read   | all     |
 _______________|________|________|________|________|_________|
|without patch: | 760545 | 110142 | 330379 | 330325 | 4742658 |
|_______________|________|________|________|________|_________|
|with patch:    | 440484 | 0      | 220277 | 19     | 3541739 |
|_______________|________|________|________|________|_________|

-25.32% syscalls after patching.

Lustre-change: https://review.whamcloud.com/47258
Lustre-commit: 4fd7d5585d33240a658f57bf7399da4415a7eb6c

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I3812d922d5b1d194d52132cba95d11820424c5d7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49201
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
2 years agoDDN-3473 build: support kernel 3.10.0-693.el7
Jian Yu [Wed, 16 Nov 2022 05:54:26 +0000 (21:54 -0800)]
DDN-3473 build: support kernel 3.10.0-693.el7

This patch fixes the following build failures to support
kernel 3.10.0-693.el7 for Lustre client:

- error: implicit declaration of function 'idr_destroy'
- error: implicit declaration of function 'gfpflags_allow_blocking'
- error: implicit declaration of function 'cdev_device_add'
- error: passing argument 1 of 'init_wait_var_entry' from
  incompatible pointer type

Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I4b5c5264fb102d3a825c92e7b1e92cf0c52540e5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49197
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-12016 tests: skip sanity/156 in interop
Andreas Dilger [Fri, 25 Nov 2022 02:28:21 +0000 (19:28 -0700)]
LU-12016 tests: skip sanity/156 in interop

Since LU-12071 was backported to b_es5_2 the version check on b_es6_0
is incorrect and this part of the test_156 should be skipped.

Test-Parameters: trivial testlist=sanity env=ONLY=156 serverversion=EXA5
Fixes: 3043c6f189 ("LU-12071 osd-ldiskfs: bypass pagecache if requested")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3fd96578e36675655fb265d83ba3f661950ab112
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49246
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15139 osp: block reads until the object is created
Alex Zhuravlev [Sun, 13 Nov 2022 14:51:30 +0000 (17:51 +0300)]
LU-15139 osp: block reads until the object is created

It is possible for a remote llog to be read and written simultaneously
during recovery. For example, the dtx recovery thread may be fetching
updates while MDD's orphan cleanup procedure is removing orphans from
PENDING.

OSP can be asked to read an object that was just created in the OSP
cache while the actual object on the remote MDS hasn't been created
yet. OSP should block such reads until the creation is done.
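
The blocking behaviour described above can be illustrated with a small
user-space sketch. This is not the OSP code: struct sketch_obj and the
sketch_obj_*() helpers are hypothetical names, and a pthread condition
variable stands in for whatever wait mechanism the patch actually uses.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a locally cached object whose remote
 * creation may still be in flight. */
struct sketch_obj {
	pthread_mutex_t lock;
	pthread_cond_t  created_cv;
	bool            created;   /* true once the remote create is done */
	int             data;      /* payload readable only after creation */
};

void sketch_obj_init(struct sketch_obj *o)
{
	pthread_mutex_init(&o->lock, NULL);
	pthread_cond_init(&o->created_cv, NULL);
	o->created = false;
	o->data = 0;
}

/* Creating side: publish the data and wake every blocked reader. */
void sketch_obj_mark_created(struct sketch_obj *o, int data)
{
	pthread_mutex_lock(&o->lock);
	o->data = data;
	o->created = true;
	pthread_cond_broadcast(&o->created_cv);
	pthread_mutex_unlock(&o->lock);
}

/* Reading side: block until creation has completed, then read. */
int sketch_obj_read(struct sketch_obj *o)
{
	int v;

	pthread_mutex_lock(&o->lock);
	while (!o->created)
		pthread_cond_wait(&o->created_cv, &o->lock);
	v = o->data;
	pthread_mutex_unlock(&o->lock);
	return v;
}
```

A reader that calls sketch_obj_read() before sketch_obj_mark_created()
has run simply sleeps on the condition variable instead of observing a
half-created object.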

Lustre-change: https://review.whamcloud.com/47003/
Lustre-commit: 4f2914537cc32fe89c4781bcfc87c38e3fe4419c

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5596c791a758dd542746afd961eb1ed9c97845be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16295 kernel: kernel update RHEL 7.9 [3.10.0-1160.80.1.el7]
Jian Yu [Fri, 18 Nov 2022 20:13:08 +0000 (12:13 -0800)]
LU-16295 kernel: kernel update RHEL 7.9 [3.10.0-1160.80.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.80.1.el7.

Lustre-change: https://review.whamcloud.com/49045
Lustre-commit: TBD (from 636e97a22936a1fab8d9e5fde40f6e1f9a1c5bc5)

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I50a0ee572d24ddc73f8af6dc32ef701c260e45b7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-6399 pcc: add tunable parameter for PCC attach thread
Qian Yingjin [Wed, 16 Nov 2022 09:26:33 +0000 (04:26 -0500)]
LU-6399 pcc: add tunable parameter for PCC attach thread

Currently the maximum number of kernel threads doing asynchronous
attach is hard-coded (1024).
This patch makes it a tunable parameter:
llite.*.pcc_max_attach_thread_num
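
As a rough illustration of what a tunable thread cap means, the
user-space sketch below gates new workers on an adjustable limit. All
names here (sketch_max_attach_threads, sketch_attach_thread_get() and
so on) are hypothetical, and how the patch itself enforces the limit
is not described in this message.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tunable; 1024 mirrors the previously hard-coded value. */
atomic_int sketch_max_attach_threads = 1024;
/* Number of attach workers currently holding a slot. */
atomic_int sketch_running_attach_threads = 0;

/* Adjust the cap at runtime (the role a tunable parameter plays). */
void sketch_set_max_attach_threads(int n)
{
	atomic_store(&sketch_max_attach_threads, n);
}

/* Try to reserve a worker slot; returns false when the cap is reached. */
bool sketch_attach_thread_get(void)
{
	int cur = atomic_load(&sketch_running_attach_threads);

	while (cur < atomic_load(&sketch_max_attach_threads)) {
		/* On failure, cur is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&sketch_running_attach_threads,
						 &cur, cur + 1))
			return true;
	}
	return false;
}

/* Release a slot when the worker finishes. */
void sketch_attach_thread_put(void)
{
	atomic_fetch_sub(&sketch_running_attach_threads, 1);
}
```

Lowering the cap below the current count does not stop running
workers in this sketch; it only refuses new ones until enough slots
have been released.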

Change-Id: Ic59c15af935dd8dff586fa6be3939d4322c136d5
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49168
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-6372 lipe: Remove colocation constraint from lamigo/lpurge resources
Gaurang Tapase [Fri, 11 Nov 2022 18:26:30 +0000 (23:56 +0530)]
EX-6372 lipe: Remove colocation constraint from lamigo/lpurge resources

We now rely on the node attribute *-recovered to start HP resources.
Hence, starting with ES 5.2.7, colocation constraints are not needed
to start resources. Moreover, with the rules added, base FS
target resources cannot start on the designated nodes because the
nodes get a -inf score. This prevents resource failback when the
original server comes back up after failover.

Test-Parameters: trivial

Signed-off-by: Gaurang Tapase <gtapase@ddn.com>
Change-Id: I890b12bf8a0d75d618a041be1eb27960dc62cc7e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artur Novik <anovik@ddn.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16160 llite: clear stale page's uptodate bit
Bobi Jam [Tue, 20 Sep 2022 16:27:04 +0000 (00:27 +0800)]
LU-16160 llite: clear stale page's uptodate bit

With the truncate_inode_page()->do_invalidatepage()->ll_invalidatepage()
call path running before the vmpage is deleted from the page cache, the
page could be picked up by ll_read_ahead_page()->grab_cache_page_nowait().

If ll_invalidatepage()->cl_page_delete() does not clear the vmpage's
uptodate bit, readahead could pick it up and wrongly treat it as
already uptodate.

In ll_fault()->vvp_io_fault_start()->vvp_io_kernel_fault(),
filemap_fault() calls ll_readpage() to read the vmpage and waits for
the vmpage to be unlocked. Once ll_readpage() has successfully read
and unlocked the vmpage, memory pressure or truncate can get in and
delete the cl_page. filemap_fault() then finds that the vmpage is not
uptodate and returns VM_FAULT_SIGBUS. To fix this, the patch makes
vvp_io_kernel_fault() restart filemap_fault() to get an uptodate
vmpage again.
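
The restart idea can be sketched in user space as a retry loop around
a single fault attempt. Everything below is hypothetical illustration:
struct sketch_page, sketch_fault_with_restart() and the max_restarts
bound are assumptions, and the real vvp_io_kernel_fault() logic may
differ (for example, it may not bound the number of restarts).

```c
#include <stdbool.h>

enum sketch_fault_result { SKETCH_FAULT_OK, SKETCH_FAULT_SIGBUS };

/* Hypothetical stand-in for a vmpage; only the uptodate bit matters. */
struct sketch_page {
	bool uptodate;
};

/* One attempt to read the page in; it may leave the page stale. */
typedef void (*sketch_fault_fn)(struct sketch_page *pg, void *arg);

/* Retry the attempt while the page turns out not to be uptodate,
 * instead of failing on the first stale page. */
enum sketch_fault_result
sketch_fault_with_restart(struct sketch_page *pg, sketch_fault_fn attempt,
			  void *arg, int max_restarts)
{
	for (int i = 0; i <= max_restarts; i++) {
		attempt(pg, arg);
		if (pg->uptodate)
			return SKETCH_FAULT_OK;
		/* Page was invalidated after the read: try again. */
	}
	return SKETCH_FAULT_SIGBUS;
}

/* Demo attempt: *arg counts how many upcoming attempts lose the race
 * against a simulated truncate or memory-pressure invalidation. */
void sketch_racy_attempt(struct sketch_page *pg, void *arg)
{
	int *races_left = arg;

	if (*races_left > 0) {
		(*races_left)--;
		pg->uptodate = false;
	} else {
		pg->uptodate = true;
	}
}
```

With no restarts allowed, a single lost race surfaces as the SIGBUS
result; allowing restarts lets the fault succeed once an attempt
completes without interference.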

Lustre-change: https://review.whamcloud.com/48607
Lustre-commit: 5b911e03261c3de6b0c2934c86dd191f01af4f2f

Test-Parameters: testlist=sanityn env=ONLY="16f",ONLY_REPEAT=50
Test-Parameters: testlist=sanityn env=ONLY="16g",ONLY_REPEAT=50
Test-Parameters: testlist=sanityn env=ONLY="16f 16g",ONLY_REPEAT=50
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I369e1362ffb071ec0a4de3cd5bad27a87cff5e05
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>