Whamcloud - gitweb
fs/lustre-release.git
16 months ago RM-620 build: New tag lipe-2.39
Andreas Dilger [Tue, 23 Jan 2024 02:09:05 +0000 (19:09 -0700)]
RM-620 build: New tag lipe-2.39

New tag lipe-2.39

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I91f46bb7da4d5fd6d06faac7c8c9975c69d7e8ce

16 months ago LU-17441 mdc: use MDS_IO_PORTAL for rename
Andreas Dilger [Thu, 18 Jan 2024 09:49:48 +0000 (02:49 -0700)]
LU-17441 mdc: use MDS_IO_PORTAL for rename

Some workloads like Apache Spark are very rename intensive, and there
may be many concurrent renames that need the BFL lock (more than the
number of MDS_REQUEST_PORTAL service threads). These renames will block
the threads until each is able to get the rename lock, and prevent
other MDS_REINT RPCs from being processed.

Since the MDS_IO_PORTAL is often unused (only needed for DoM files),
and has existed since 2.11.0, it seems possible to move the rename
RPCs to be serviced by the MDS_IO_PORTAL threads to avoid contention
on the primary MDS service threads. Also, it will avoid blocking
normal file open, setattr, statfs, and other common operations if the
BFL lock is contended. Even with DoM files they may have read-on-open
handling and only DoM writes would be blocked by the uncommon rename.

Lustre-change: https://review.whamcloud.com/53725
Lustre-commit: TBD (from b31c07cf18882b150d3e49ceee85a187e7a9b159)

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I623a27de1482778f3c9fc6bb5bbcf917611dc75b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53749
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
16 months ago EX-8311 csdc: allow specify 'fast'/'best' compression type
Bobi Jam [Tue, 24 Oct 2023 14:02:55 +0000 (22:02 +0800)]
EX-8311 csdc: allow specify 'fast'/'best' compression type

Use lctl set_param osc.*.compress_type_{fast|best}=<type>:<level>
to specify the compression_type:level for LL_COMPR_TYPE_FAST/
LL_COMPR_TYPE_BEST.

lctl get_param osc.*.compress_type_{fast|best} will list these
values.
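
As a rough illustration of the <type>:<level> value format, the sketch
below splits and validates such a string. It is a Python model, not the
kernel parser: the function name, the error handling, and the set of type
names (borrowed from the compress_types list shown in the EX-6269 commit
in this log) are assumptions.

```python
# Hypothetical model of parsing "<type>:<level>"; not the Lustre implementation.
KNOWN_TYPES = {"gzip", "lz4fast", "lz4hc", "lzo"}  # assumed set of concrete types


def parse_compress_spec(spec):
    """Split 'type:level' into (type, level), validating both parts."""
    ctype, sep, level = spec.strip().partition(":")
    if not sep:
        raise ValueError(f"expected <type>:<level>, got {spec!r}")
    if ctype not in KNOWN_TYPES:
        raise ValueError(f"unknown compression type {ctype!r}")
    if not level.isdigit():
        raise ValueError(f"level must be a non-negative integer, got {level!r}")
    return ctype, int(level)
```

For example, parse_compress_spec("lz4fast:3") yields the pair
("lz4fast", 3), while a value without a level is rejected.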

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ifeff7f25e30fc0884f0c770a3b6d0798937b3c35
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52814
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months ago LU-15288 lnet: increase transaction timeout
Cyril Bordage [Tue, 7 Dec 2021 22:14:43 +0000 (23:14 +0100)]
LU-15288 lnet: increase transaction timeout

In LU-13145, it was decided to increase default transaction timeout
(LNET_TRANSACTION_TIMEOUT_DEFAULT) to 150s. But, in the associated
patch, it was set to 50s. This modification will also modify
lnd_timeout (from 16 to 49).
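
The lnd_timeout values quoted above can be reproduced with a small
calculation. This sketch assumes the LND timeout is derived from the
transaction timeout as (transaction_timeout - 1) / (retry_count + 1) with
a retry count of 2; both the formula and the retry count are assumptions
chosen because they match the numbers in this message.

```python
# Assumed relation: lnd_timeout = (transaction_timeout - 1) // (retry_count + 1).
def lnd_timeout(transaction_timeout, retry_count=2):
    """Integer LND timeout in seconds under the assumed formula."""
    return (transaction_timeout - 1) // (retry_count + 1)


print(lnd_timeout(50))   # 16
print(lnd_timeout(150))  # 49
```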

Lustre-change: https://review.whamcloud.com/45780
Lustre-commit: 18b4e28f18d55291f8a14a3bd9ee84b1a686a93e

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I13a8b5d14230bb6e8936cb3e18540f19dbc62985
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53747
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months ago LU-16913 revert "EX-7849 quota: extra debug messages"
Andreas Dilger [Fri, 19 Jan 2024 19:28:05 +0000 (19:28 +0000)]
LU-16913 revert "EX-7849 quota: extra debug messages"

This reverts commit 02242f6f1ba1867756ee5b91abd2207f646436cf.
Extra debugging is no longer needed.

Change-Id: I083b70a911ac85fb5a1054c8409146bb393e94b0
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53746
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months ago LU-12031 tests: update interop version checks
Andreas Dilger [Mon, 22 Jan 2024 16:40:20 +0000 (16:40 +0000)]
LU-12031 tests: update interop version checks

Update the version check in sanity test_270j and
sanity-hsm test_1f to match actual landing version.

Change-Id: Ifd6d5dec50e3fcbb7ebe77ab41335a6c3994ef57
Test-Parameters: trivial
Fixes: 3bccd95ca2 ("LU-12031 mdt: explicit data version of DoM files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53762
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months ago EX-8042 lipe: Fix size calculation when using -blocks option
Vitaliy Kuznetsov [Thu, 18 Jan 2024 11:31:36 +0000 (12:31 +0100)]
EX-8042 lipe: Fix size calculation when using -blocks option

This patch fixes the size calculation in the "-blocks"
option when specifying the exact size value "n[bkMG]".

Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I5dd0ce69cef20ab9a9632798f350cf4c9f96cf25
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53723
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months ago DDN-4623 obdclass: fix upcall_cache_get_entry
Sebastien Buisson [Fri, 19 Jan 2024 17:09:14 +0000 (18:09 +0100)]
DDN-4623 obdclass: fix upcall_cache_get_entry

When an entry is found while holding the read lock, we need to
convert to a write lock and find again, to check that entry was
not modified/freed in between.
In this case, the variable indicating an entry was found must be
reset, because we might not find any valid entry after all.
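
The pattern described above can be modelled in a few lines. This is a
simplified Python sketch, not the kernel code: entries are plain dicts,
locking appears only as comments, and the on_lock_upgrade hook is a
hypothetical stand-in for whatever other threads do between dropping the
read lock and taking the write lock.

```python
# Simplified model; entries are dicts and locking is shown only as comments.
def find_valid(entries, key):
    """Return the first valid entry matching key, or None."""
    for entry in entries:
        if entry["key"] == key and entry["valid"]:
            return entry
    return None


def get_entry(entries, key, on_lock_upgrade=None):
    # read_lock(): first lookup
    entry = find_valid(entries, key)
    if entry is not None:
        # read_unlock(); other threads may run before write_lock() is taken
        if on_lock_upgrade is not None:
            on_lock_upgrade(entries)
        # write_lock(): discard the earlier result and search again
        entry = find_valid(entries, key)
    if entry is None:
        # nothing valid was found after all, so add a new entry
        entry = {"key": key, "valid": True, "fresh": True}
        entries.append(entry)
    return entry
```

The important step is that the result of the first lookup is not carried
over: if the second search finds nothing, the caller falls through to the
"not found" path instead of using a stale entry.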

Fixes: 127128bed3 ("LU-16498 obdclass: change uc_lock to rwlock")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I111af4562ac78eeb22102a8a28943e46e30b4019
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53743
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago RM-620 build: New tag 2.14.0-ddn130
Andreas Dilger [Thu, 18 Jan 2024 09:29:17 +0000 (02:29 -0700)]
RM-620 build: New tag 2.14.0-ddn130

New tag 2.14.0-ddn130

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iff6822b7d54b0cf9e1946bccf20069fa2ec51e3e

17 months ago EX-1878 lipe: resync all stale files
Alexandre Ioffe [Fri, 8 Dec 2023 22:17:37 +0000 (14:17 -0800)]
EX-1878 lipe: resync all stale files

Add --resync-all-stales option (default is on).
This option allows lamigo by default to resync
all files if any component of a file
is stale, regardless of pool or OST location.
If the option is off, lamigo works the old way.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ibc26a21fa99f75de87a8e0328b183d96b7548c1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-8353 csdc: rename "cp_comp_*" to "cp_compr_*"
Jian Yu [Wed, 17 Jan 2024 01:10:58 +0000 (17:10 -0800)]
EX-8353 csdc: rename "cp_comp_*" to "cp_compr_*"

This patch renames "cp_comp_type", "cp_comp_level",
and "cp_chunk_log_bits" to use "compr" in the name
to be consistent with other variable names.

Test-Parameters: trivial

Change-Id: I428ff3a789b33da02832dee02f316b02d97137e2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52761
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-8993 ofd: use niocount consistently
Patrick Farrell [Tue, 21 Nov 2023 20:25:34 +0000 (15:25 -0500)]
EX-8993 ofd: use niocount consistently

'niocount' refers to the number of remote niobufs, ie, the
number of separate IOs from the client.  Except for a few
places, where it's used to refer to the number of pages in
the entire RPC.  Eek.

Replace this usage with 'npages', making niocount usage
consistent.
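
To make the distinction concrete, the sketch below counts both quantities
for one RPC. It is a Python model with assumed inputs: each remote niobuf
is represented as an (offset, length) pair in bytes, and a 4096-byte page
size is assumed.

```python
PAGE_SIZE = 4096  # assumed page size


def pages_spanned(offset, length):
    """Number of pages touched by the byte range [offset, offset + length)."""
    if length == 0:
        return 0
    first = offset // PAGE_SIZE
    last = (offset + length - 1) // PAGE_SIZE
    return last - first + 1


def count_io(niobufs):
    """Return (niocount, npages) for a list of (offset, length) remote niobufs."""
    niocount = len(niobufs)
    npages = sum(pages_spanned(off, length) for off, length in niobufs)
    return niocount, npages


print(count_io([(0, 1048576), (4194304, 8192)]))  # (2, 258)
```

Two separate IOs give niocount = 2, while the same RPC covers 258 pages,
which is why using one name for both values is confusing.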

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I266087ad8dccadb54c054b0a11fb03dc9868a725
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-8971 pcc: add lctl pcc abort command to abort attaches
Qian Yingjin [Fri, 12 Jan 2024 09:03:15 +0000 (04:03 -0500)]
EX-8971 pcc: add lctl pcc abort command to abort attaches

This patch adds a new PCC command "lctl pcc abort [--wait|-w]
[--detach|-d] $LUSTRE_MNTPT $PCCROOT".
--wait|-w: wait until all in-flight attaches are aborted.
--detach|-d: detach the PCC copies when scanning the PCC backend.

It can be used to abort in-progress attaches for a given PCC
backend. It does not remove the PCC backend from a client.

Add sanity-pcc/test_109 to verify it.

Change-Id: Ib7152f7418aa1beb840919e98bf8de53c99b5c54
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53656
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-17370 utils: simplify lfs-mirror-extend help text
Alexandre Ioffe [Fri, 5 Jan 2024 04:54:11 +0000 (20:54 -0800)]
LU-17370 utils: simplify lfs-mirror-extend help text

Add list of lfs setstripe command line options
to help text of lfs mirror extend.
Simplify syntax of lfs mirror extend help text.
Update corresponding lfs-mirror-extend man page.
In the man pages, use left-side adjustment and disable hyphenation
('.ad l', '.nh') to prevent hyphenation of keywords.

Lustre-change: https://review.whamcloud.com/53719
Lustre-commit: TBD (from 2a71d159d4ac98a3252f12796b8688bfa4d6df50)

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial
Change-Id: I6cffcdb9651062e169f53868827646b876a82cb5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53598
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago EX-8851 lustre: add uncompressed size to compression header
Patrick Farrell [Wed, 20 Dec 2023 19:51:55 +0000 (14:51 -0500)]
EX-8851 lustre: add uncompressed size to compression header

It's useful to have the uncompressed size of the data in the
compression header.  Also, we have three checksum fields -
compressed, uncompressed, and header, but in practice,
checksumming the compressed data including the header is
enough to cover all of these.

This patch cleans up all of this at the same time.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie82e0dbe9c862ddc88999b109cea1f27577dbbff
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53520
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-17218 ofd: improve filter_fid upgrade compatibility
Bobi Jam [Mon, 23 Oct 2023 07:29:07 +0000 (15:29 +0800)]
LU-17218 ofd: improve filter_fid upgrade compatibility

filter_fid could be expanded in a later Lustre version, and with an
upgrade followed by a downgrade, the filter_fid EA on disk could
have been expanded during the upgrade, and won't work after
the downgrade.

This patch improves this process by allocating a bigger buffer to
hold the expanded filter_fid EA and then trimming the unrecognized
fields off.

Lustre-change: https://review.whamcloud.com/52798
Lustre-commit: TBD (from cffd0a099c30794a63268da008958f722882119b)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I4c99f1d9f3962d46ebf9e9b799988ff3dba4f919
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53662
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-16637 llite: tolerate fresh page cache pages after truncate
Andrew Perepechko [Tue, 26 Dec 2023 17:02:12 +0000 (20:02 +0300)]
LU-16637 llite: tolerate fresh page cache pages after truncate

Truncate called by ll_layout_refresh() can race with a fast read
or tiny write, which can add an uninitialized non-uptodate page
into the page cache.

We want to avoid expensive locking for this rare case so if there
is any leftover in the cache after truncate, just check that
the pages are not uptodate, not dirty and do not have any
filesystem-specific information attached to them.
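
The check described above amounts to a simple predicate over the leftover
pages. The sketch below is a Python model with a hypothetical Page type;
the field names stand in for the page flags and private data and are not
the kernel's identifiers.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a page cache page."""
    uptodate: bool = False
    dirty: bool = False
    private: object = None  # models filesystem-specific data attached to the page


def leftover_pages_tolerable(pages):
    """True if every leftover page is not uptodate, not dirty, and has no private data."""
    return all(
        not p.uptodate and not p.dirty and p.private is None
        for p in pages
    )
```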

Lustre-change: https://review.whamcloud.com/53554
Lustre-commit: TBD (from f4c8d44a7c2f0fbc2c74d1832ff63c5216c22c38)

Change-Id: I8cadc022a3d1822a585f32e1a765e59ad0ff434d
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-11937
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53611
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-17364 llite: don't use stale page.
Alexey Lyashkov [Fri, 12 Jan 2024 18:55:55 +0000 (13:55 -0500)]
LU-17364 llite: don't use stale page.

Using a stale page for write might confuse the read path,
which expects any IO page to have the PG_uptodate flag set,
and it caused a panic when removing the page from IO.

Lustre-Change: https://review.whamcloud.com/53550
Lustre-Commit: TBD (from f7b42523e669d3653ca7c442fe82afde618bbdd5)

Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ia01129ceaecf53d8d9f301c26cd2d65122f6a267
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53666
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-16498 obdclass: fix write unlock for internal case
Sebastien Buisson [Mon, 15 Jan 2024 08:57:53 +0000 (09:57 +0100)]
LU-16498 obdclass: fix write unlock for internal case

Holding a (write) lock is mandatory for put_entry(), so fix that in
refresh_entry_internal().

Fixes: 127128bed3 ("LU-16498 obdclass: change uc_lock to rwlock")
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If55182ca29f37f2a783fdb88ba46512944a61c47
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago RM-620 build: New tag 2.14.0-ddn129
Andreas Dilger [Sat, 13 Jan 2024 02:51:06 +0000 (19:51 -0700)]
RM-620 build: New tag 2.14.0-ddn129

New tag 2.14.0-ddn129

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9310e8bddd0fd14b5c8c1faa109bdab19454eca1

17 months ago LU-17354 osp: don't reset sequence client
Alex Zhuravlev [Tue, 12 Dec 2023 08:57:53 +0000 (11:57 +0300)]
LU-17354 osp: don't reset sequence client

Do not reset the sequence client if sequence allocation returned an
error; instead try to get the sequence later upon reconnection.

Lustre-change: https://review.whamcloud.com/53406
Lustre-commit: TBD (from 5cce95b35c652564b084f0721d4775d0fd522aa7)

Fixes: 6c4c51e307 ("LU-1445 osp: Use FID to track precreate cache.")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie23b688e4f93651c4615d77a9686c44a150d3961
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53417
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-17365 lod: handle llog errors gracefully
Mikhail Pershin [Wed, 13 Dec 2023 12:43:53 +0000 (15:43 +0300)]
LU-17365 lod: handle llog errors gracefully

Distinguish remote llog errors by their source and type
in LOD lod_sub_prep_llog() and make the errors reported
by llog_osd_read_header() and llog_init_handle() uniform.

- Partial llog header or 0-size llog is to be
  reinitialized, new header is created
- in llog_read_header(), errors from dt_attr_get() and
  read_header() are printed and returned as -EIO to the caller
- llog with invalid llog header data is skipped and a new
  one is created to be used instead. To indicate that,
  llog_init_handle() returns the -EINVAL error code instead
  of -EIO. Therefore network errors are to be handled by
  lod_sub_recovery_thread() retry logic while bad llog
  content will lead to immediate llog re-creation.
- lod_sub_init_llogs() tries to init all targets even
  if some failed
- always recreate llogs after recovery abort no matter
  if ctxt->loc_handle exists or not
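
The error handling in the list above can be summarised as a small
dispatch on the return code. This is a Python sketch of the policy as
described in this message, not the LOD code; the function name and the
returned labels are hypothetical, and treating every error other than
-EINVAL as retryable is an assumption.

```python
import errno


def classify_llog_init_error(rc):
    """Map an llog init return code to the action described in the commit text."""
    if rc == 0:
        return "ok"
    if rc == -errno.EINVAL:
        # invalid llog header data: skip the llog and create a new one
        return "recreate"
    # -EIO and, by assumption, any other error: leave it to the retry logic
    return "retry"
```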

The patch tries to cover known issues and types of errors during
update log recovery, and also provides better debugging for
similar cases in the future.

Lustre-change: https://review.whamcloud.com/53510
Lustre-commit: e81805244476f1d3ffb5a2ecb0a85f54b936ce51

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2705e0dc245ed4365123ce47137193a9ed769673
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53630
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-10283 mdd: fix parent FID in changelog of striped directory
Dmitry Ivanov [Mon, 16 May 2022 18:15:19 +0000 (12:15 -0600)]
LU-10283 mdd: fix parent FID in changelog of striped directory

Changelog entries for file operations such as create, rename,
link, unlink, and mkdir referred to the parent FID ("p=") as a shard's
FID in a striped directory. The same was true for the source's
parent FID ("sp="). This commit hides the Lustre internals from
the user, displaying the parent directory's FID instead, as expected.

An object might be in a remote MDT, in which case obtaining the parent
FID via the linkEA can be an expensive operation, so the parent FID is
cached in the mdd_object, so that the cost of the cross-MDT RPC is
amortized over the lifetime of the object.

Certain userspace tools might depend on the previous behavior of
displaying the shard's parent FID in the changelog records, so this
can be enabled by setting mdd.*.enable_shard_pfid=1, if this is
required for compatibility.

Lustre-change: https://review.whamcloud.com/51322
Lustre-commit: 3554923af9e3260235865d90949ecd2924bbbc0e

HPE-bug-id: LUS-10721
Signed-off-by: Dmitry Ivanov <dmitry.ivanov2@hpe.com>
Change-Id: Iae15b49f5852f36ba62ae1706d3a5f4ebf307bc4
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53475
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
17 months ago LU-16498 obdclass: change uc_lock to rwlock
Sebastien Buisson [Thu, 14 Sep 2023 16:00:04 +0000 (18:00 +0200)]
LU-16498 obdclass: change uc_lock to rwlock

Change the upcall cache uc_lock to a read-write lock so that threads
can get the read lock to do concurrent lookups in the upcall cache,
and only grab the write lock in the rare case when a new entry is
added or old entries are expired. That reduces serialization between
server threads during normal operation, and avoids all of the threads
spinning for some time if the requested key (UID or gss context) is
not in the cache at all, before they sleep.

Lustre-change: https://review.whamcloud.com/52395
Lustre-commit: TBD (from 003615a0a6711334d95c42f3c41852e1cbc8e77b)

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I812400104fd2115d19386fb4a03bb3ce99c49383
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53622
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-17374 gss: get rid of rsi cache entries after req handle
Sebastien Buisson [Mon, 18 Dec 2023 13:59:30 +0000 (14:59 +0100)]
LU-17374 gss: get rid of rsi cache entries after req handle

RPCSEC init requests are kept in the rsi cache. While this is useful
during request processing involving upcall/downcall with userspace,
rsi entries are never used again once RPCSEC init requests have been
handled completely.
And keeping entries in the rsi cache has some impact on authentication
speed. When a new RPCSEC init request is received, the first step is
to check if there is a valid matching entry in the cache. It is never
the case, except if an authentication request is replayed, but GSS
rejects that anyway.
So we spend time browsing a cache from which we expect no match. Even
if the upcall cache mechanism takes this lookup opportunity to remove
invalid or expired entries, it is even better to remove cache entries
as soon as we know they are done.

Lustre-change: https://review.whamcloud.com/53488
Lustre-commit: 7a56a689d4aa588bd003e35fdb93d87cf1e56d1d

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia9946578c3d3149e6235d832df28214ae8984f1e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53610
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago EX-7849 quota: notify newest lqe in qmt_set_id_notify
Sergey Cheremencev [Fri, 15 Dec 2023 18:57:02 +0000 (21:57 +0300)]
EX-7849 quota: notify newest lqe in qmt_set_id_notify

It is possible that lqe_locate may call lqe_find inside
qmt_pool_lqes_lookup_spec and insert the 2nd lqe into
lqs_hash while processing the previous one. Do not add the
1st lqe to be processed by qmt_reba_thread in qmt_id_lock_notify,
as this lqe will be freed at the end of lqe_locate_find due
to the race with the 2nd, which already exists in lqs_hash.
This change may resolve the following assertion:

  (qmt_lock.c:950:qmt_id_lock_glimpse()) ASSERTION( lqe->lqe_gl ) failed:
  (qmt_lock.c:950:qmt_id_lock_glimpse()) LBUG

Lustre-change: https://review.whamcloud.com/53637
Lustre-commit: 2832874970232fb5e1deedbf89b7a482518e6886

Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Fixes: 09f9fb3211 ("LU-11023 quota: quota pools for OSTs")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I179edb06ec8c784636f566ffeba0035c6758a55b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-7795 tests: add sanity-compr test for compression
Jian Yu [Tue, 9 Jan 2024 07:09:10 +0000 (23:09 -0800)]
EX-7795 tests: add sanity-compr test for compression

This patch adds a sanity-compr test to validate that
we get space usage reduction with compression.
The test uses ll_compression_scan tool to calculate
the compressed size of the source file and compares
it with the size of the Lustre CSDC compressed file.

Test-Parameters: trivial env=ONLY=1007 testlist=sanity-compr

Change-Id: Icf763331205a3e937b794f90444f756fc59f9050
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-7601 ldiskfs: fix detection of compressed extent
Patrick Farrell [Wed, 3 Jan 2024 22:23:20 +0000 (17:23 -0500)]
EX-7601 ldiskfs: fix detection of compressed extent

The code in ldiskfs_map_inode_pages which detects a
compressed extent checks the first lnb for that extent, but
it's possible for some lnbs and not others to be compressed
in a given extent, so we must check all of them.  This
occurs when multiple writes have been combined into one
RPC.

If we don't detect compression correctly, we won't set the
file size correctly and we'll get data corruption.
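
The difference between the old and the new check can be shown with a
short model. This is a Python sketch with hypothetical names: each lnb is
a dict with a "compressed" flag, which is an assumption about how the
state is represented and not the ldiskfs data structure.

```python
def extent_is_compressed_first_only(lnbs):
    """Old behaviour as described: look only at the first lnb of the extent."""
    return bool(lnbs) and lnbs[0]["compressed"]


def extent_is_compressed(lnbs):
    """Fixed behaviour as described: the extent counts as compressed if any lnb is."""
    return any(lnb["compressed"] for lnb in lnbs)


# Two writes combined into one RPC: only the second one is compressed.
mixed = [{"compressed": False}, {"compressed": True}]
print(extent_is_compressed_first_only(mixed))  # False
print(extent_is_compressed(mixed))             # True
```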

Fixes: b489a2a397 ("EX-7600 osc: save compressed object size")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I11d50bdc45c40d93bb1b829fcd930165b7626432
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53588
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-7601 osc: add COMPR_GAP check to compress_request
Patrick Farrell [Wed, 3 Jan 2024 21:39:31 +0000 (16:39 -0500)]
EX-7601 osc: add COMPR_GAP check to compress_request

Currently, compress_request will build the compression
buffer (calling merge_chunk()) for requests which are less
than the minimum compression gap.  This is noticed in the
compression code when it checks if there's enough data to
attempt compression, but we can do a trivial check in
compress_request() to save that work.
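
A minimal sketch of the early check, written as a Python model. The
parameter names are hypothetical: min_gap stands in for the minimum
compression gap and merge_chunk for the buffer-building step, and the
byte-count comparison is an assumption about how "less than the minimum
compression gap" is measured.

```python
def compress_request(chunks, min_gap, merge_chunk):
    """Return the merged buffer to compress, or None when the request is too small."""
    total = sum(len(chunk) for chunk in chunks)
    if total < min_gap:
        # too little data to be worth compressing: skip building the buffer
        return None
    return merge_chunk(chunks)
```

With the check placed first, merge_chunk is never called for small
requests, which is the work the commit aims to save.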

Also fix a few minor style things.

This is not an important fix, but I discovered it while
investigating another issue and it's trivial to resolve.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ieb32e6297e10d229f23c58e2ef4d933ce3dda4f2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-6269 csdc: tune allowed compression type on server
Bobi Jam [Thu, 7 Sep 2023 09:29:42 +0000 (17:29 +0800)]
EX-6269 csdc: tune allowed compression type on server

Use lctl get_param {mdt|obdfilter}.*.compress_types to list supported
compression types on server.

Use lctl set_param {mdt|obdfilter}.*.compress_types="+gzip-lzo" to
add gzip to and delete lzo from existing compression types on server.
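
The effect of a value such as "+gzip-lzo" can be modelled as a series of
set additions and removals. The sketch below is a Python illustration
only: the function name and the strict parsing rules are assumptions, and
the server-side parser may accept other forms.

```python
import re


def apply_compress_types(current, spec):
    """Return a new set after applying '+name' additions and '-name' removals."""
    ops = re.findall(r"([+-])([A-Za-z0-9]+)", spec)
    # reject input with anything other than a sequence of +name / -name tokens
    if "".join(sign + name for sign, name in ops) != spec:
        raise ValueError(f"cannot parse {spec!r}")
    result = set(current)
    for sign, name in ops:
        if sign == "+":
            result.add(name)
        else:
            result.discard(name)
    return result


print(sorted(apply_compress_types({"lz4fast", "lzo"}, "+gzip-lzo")))
# ['gzip', 'lz4fast']
```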

The server will negotiate supported compression types with the client in
ocd_compress_types, and the client import stats show the negotiated
supported compression types in the "lctl get_param {mdc|osc}.*.import"
connect_data section:

import:
  ....
  connect_data:
     ....
     compress_types: [ fast,best,gzip,lz4fast,lz4hc,lzo ]

The OST support for the connect flags is enabled in this patch, but
MDT enabling is pending unaligned compression support for DoM files.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I18943352e25ed9d5fe82442df9f00a7ef388f242
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52307
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-6269 osc: handle different compression types
Andreas Dilger [Fri, 5 Jan 2024 23:26:05 +0000 (23:26 +0000)]
EX-6269 osc: handle different compression types

Allow the client to handle different compression types in a single
component.  This shouldn't happen normally, but it may happen in
the future if there is dynamic compression algorithm selection for
"fast" or "best" types (e.g. compress based on available CPU and
network bandwidth or RPC backlog).

Change-Id: Ide2731c60a68584e7cbb474bee88a17e9a7b8fec
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53602
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
17 months ago EX-6269 osc: decompress with algorithm from server
Patrick Farrell [Fri, 5 Jan 2024 20:24:08 +0000 (15:24 -0500)]
EX-6269 osc: decompress with algorithm from server

Data may not be compressed with the compression type and
level from the layout, so we must use the compression type
and level from storage for decompression.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib4cdccf294ef631a25147413d7f5c1a847c9504e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-6269 obd: add 'lvl' for best and fast
Patrick Farrell [Tue, 9 Jan 2024 23:20:05 +0000 (18:20 -0500)]
EX-6269 obd: add 'lvl' for best and fast

'best' and 'fast' compression types must also set a level,
because not all levels are supported by all algorithms.

Rather than trying to be clever, just use simple universally
supported values, except for lz4fast, where we special case
this, because otherwise '0' is the slowest setting (and
lz4fast is likely to remain our default fastest).

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7c29659d4f027af2e44285ae38e4c9d91e35509a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53600
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-8739 tests: skip sanity-pcc tests on el9.3
Andreas Dilger [Tue, 9 Jan 2024 04:06:50 +0000 (21:06 -0700)]
EX-8739 tests: skip sanity-pcc tests on el9.3

Skip sanity-pcc test_6, test_7a/7b, test_23, test_35 on RHEL9.3
clients due to continuous failures with PCC-RW, which is unused.

Skip sanity-pcc test_102 due to el9.3 fio io_uring bug.

Test-Parameters: trivial testlist=sanity-pcc clientdistro=el9.3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I76cbd0342788fff8b0167c0656e941f96d73fc48
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53618
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-12998 tests: fix conf-sanity/112a version check
Andreas Dilger [Mon, 8 Jan 2024 06:26:02 +0000 (23:26 -0700)]
LU-12998 tests: fix conf-sanity/112a version check

Fix the version number in the conf-sanity test_112a version check
due to skew before landing the patch.

Fixes: b2be94f559 ("LU-12998 mds: add no_create parameter to stop creates")
Test-Parameters: trivial testlist=conf-sanity env=ONLY=112 serverversion=EXA6
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I080de5421b918cf5e0d692740fb37b514a6c1014
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago RM-620 build: New tag 2.14.0-ddn128
Andreas Dilger [Sat, 6 Jan 2024 08:22:54 +0000 (01:22 -0700)]
RM-620 build: New tag 2.14.0-ddn128

New tag 2.14.0-ddn128

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie84bfc97c732043030769c183a7e8a879bb3e0f1

17 months ago LU-17289 test: fix sanity/906 version check
Andreas Dilger [Thu, 4 Jan 2024 00:07:34 +0000 (00:07 +0000)]
LU-17289 test: fix sanity/906 version check

Fix the version check in test_906 to include RHEL9.3.0.

Change-Id: I7e066cdd16946b541fee96281dd5a5c90daa7072
Fixes: a6739c9c9a ("LU-17289 test: disable sanity/test_906 temporarily")
Test-Parameters: trivial testlist=sanity clientdistro=el9.3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17352 utils: lljobstat can read dumped stats files
Lei Feng [Sun, 10 Dec 2023 08:45:38 +0000 (16:45 +0800)]
LU-17352 utils: lljobstat can read dumped stats files

Improve lljobstat command to read dumped stats file.
Usually the file is generated by command:
  lctl get_param *.*.job_stats > all_job_stats.txt

Multiple files can be specified with multiple --statsfile
options. For example:
  lljobstat --statsfile=1.txt --statsfile=2.txt

Stats data from multiple files will be added up and
sorted. Then the top jobs will be listed.
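
As a rough illustration of the "add up and sort" step described above,
the following Python sketch merges per-job counters from several
already-parsed stats files and lists the busiest jobs. The dict layout,
merge_job_stats() and top_jobs() are hypothetical names chosen for this
example and are not taken from lljobstat itself.

```python
from collections import Counter


def merge_job_stats(per_file_stats):
    """Sum per-job operation counts across several parsed stats files.

    per_file_stats: iterable of dicts mapping job_id -> {op_name: count}.
    This layout is an assumption; real job_stats output has more fields.
    """
    totals = {}
    for stats in per_file_stats:
        for job_id, ops in stats.items():
            # Normalize the key to a string so that a purely numeric
            # job_id is merged with its string form (illustrative choice).
            totals.setdefault(str(job_id), Counter()).update(ops)
    return totals


def top_jobs(totals, count=5):
    """Return the `count` busiest jobs as (job_id, total_ops) pairs."""
    ranked = sorted(
        ((job_id, sum(ops.values())) for job_id, ops in totals.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:count]


file_1 = {"rsync.1000": {"open": 10, "close": 10}, "1234": {"read": 5}}
file_2 = {"rsync.1000": {"open": 2}, 1234: {"read": 7, "write": 1}}
print(top_jobs(merge_job_stats([file_1, file_2]), count=2))
```

Sorting on the negated total with the job_id as a tie-breaker keeps the
listing stable when two jobs have the same operation count.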

Try to use CLoader to accelerate the YAML parsing.

Handle SIGINT and exit silently if lljobstat is in the loop
of reading system job_stats files periodically.

Fix a bug when the job_id is a pure number.

Lustre-change: https://review.whamcloud.com/53397
Lustre-commit: ef2555d7af21bd35756805b13e6b458f56cecf54

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: Iee1ce69d2befb9d021e34effd4fc65a47297c1fb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53582
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoLU-17048 mdd: protect layout change in MDD layer
Bobi Jam [Mon, 28 Aug 2023 13:08:34 +0000 (21:08 +0800)]
LU-17048 mdd: protect layout change in MDD layer

We need to detect changes to the LOD layout in between transaction
declaration and when the objects are locked during transaction
execution. Otherwise, if another thread has modified the layout
of an object used by the transaction then the declaration may
be incorrect.

This patch saves each object's layout generation in the transaction
declaration phase and checks whether it has been changed by another
thread in the transaction execution phase. If it has, the transaction
is retried, up to several times.
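
The save-check-retry pattern described above can be sketched in Python
as follows. This is only an illustration of the general idea under
stated assumptions: Obj, run_transaction(), LayoutChanged and the retry
limit of 3 are hypothetical and do not come from the MDD code.

```python
import threading


class LayoutChanged(Exception):
    """Raised when the layout generation keeps changing between phases."""


class Obj:
    """Hypothetical object with a lock and a layout generation counter."""

    def __init__(self):
        self.lock = threading.Lock()
        self.layout_gen = 0


def run_transaction(obj, declare, execute, max_retries=3):
    """Declare, then execute only if the layout generation is unchanged.

    Returns (result, attempts_used).
    """
    for attempt in range(1, max_retries + 1):
        saved_gen = obj.layout_gen      # saved in the declaration phase
        plan = declare(obj)
        with obj.lock:                  # object locked for execution
            if obj.layout_gen == saved_gen:
                return execute(obj, plan), attempt
        # The layout changed after declaration, so the declaration may
        # be stale: drop it and declare again.
    raise LayoutChanged(f"layout still changing after {max_retries} tries")
```

Bounding the number of retries keeps a transaction from spinning
forever if the layout is modified continuously by other threads.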

Lustre-change: https://review.whamcloud.com/52146
Lustre-commit: d5ab62af24166529b84b4d7227b96d3a69989a95

Fixes: b7bd4e3422 ("LU-14621 mdd: fix lock-tx order in mdd_xattr_merge()")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I25fe03c6e8fc4eebccc039e62dfc88db1179cb26
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoEX-7601 osc: debug fix in decompress_request
Patrick Farrell [Thu, 4 Jan 2024 18:56:29 +0000 (13:56 -0500)]
EX-7601 osc: debug fix in decompress_request

Debug message had an incorrect subtraction.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5daf360766ca77b98dc5af3d72c42ac38f5782bc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 tests: add mmap write test
Patrick Farrell [Fri, 29 Dec 2023 20:10:58 +0000 (15:10 -0500)]
EX-7601 tests: add mmap write test

This improves the existing mmap test to test mmap writing
as well as mmap reading.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1003",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81840c7bbbefbc5c3bae6b270c2d94297a254d19
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53307
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoEX-7601 tests: add multi-mount compression test
Patrick Farrell [Fri, 29 Dec 2023 20:10:36 +0000 (15:10 -0500)]
EX-7601 tests: add multi-mount compression test

This adds a multi-mount correctness test for compression.
This races IO from two mountpoints at varying sizes to
stress test compression.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY=1006
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If49cbd6d171068faa802835146f273d835b39bc3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51842
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoEX-7601 tests: tests for read-modify-write
Patrick Farrell [Fri, 29 Dec 2023 20:09:49 +0000 (15:09 -0500)]
EX-7601 tests: tests for read-modify-write

This patch adds tests for the read-modify-write case for
EX-7601.  There are still some additional tests to be added
here, but this is a good start.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1004 1005",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5dd9e566b8274ece99283c8962e0d34225089cc0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53230
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 osc: add check to decompress_request
Patrick Farrell [Tue, 19 Dec 2023 04:19:44 +0000 (23:19 -0500)]
EX-7601 osc: add check to decompress_request

decompress_request should check that there's room in
the RPC for the decompressed data.  A lack of room can occur if
there's a bug or data corruption, and without the check we will
go past the end of the RPC during decompression.
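
A minimal Python sketch of this kind of guard is shown below. It uses
zlib purely as a stand-in compressor, and decompress_into() and its
buf_room parameter are hypothetical names. It bounds the output while
decompressing, which may differ from how the patch performs its check.

```python
import zlib


def decompress_into(buf_room, compressed):
    """Decompress `compressed`, refusing output larger than `buf_room`."""
    d = zlib.decompressobj()
    # Ask for at most one byte more than fits, so that overflow can be
    # detected without ever producing an unbounded amount of output.
    out = d.decompress(compressed, buf_room + 1)
    if len(out) > buf_room:
        raise ValueError(f"decompressed data exceeds {buf_room} bytes")
    return out


payload = zlib.compress(b"a" * 100)
print(len(decompress_into(100, payload)))
```

Failing with an error here turns a potential overrun of the buffer
into a reportable condition.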

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib1bf19bf39701b72f0f5a61b2aaff2f2fdad1897
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: add checks to io_lnb_to_tx_lnb
Patrick Farrell [Wed, 13 Dec 2023 23:33:16 +0000 (18:33 -0500)]
EX-7601 ofd: add checks to io_lnb_to_tx_lnb

We should always be able to find the remote niobuf in the
local io range; if we can't, there's a bug.  So assert on
this.

We should also never have page level overlapping remote IOs,
at least until we have unaligned DIO.  (We can remove this
check when we combine the features.)

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I325d4a37d25c116e42621964e90b225b71fd8f1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53450
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: add past eof check for reads
Patrick Farrell [Tue, 12 Dec 2023 17:35:25 +0000 (12:35 -0500)]
EX-7601 ofd: add past eof check for reads

The client does not normally generate reads past EOF, but
this can occur during some racing situations.  We need to
check for that case and not attempt decompression, since
there's no data to decompress if we're reading past EOF.

This covers a failure which shows up occasionally in the
racing parts of the test suite, but it's challenging to
write an explicit test for this.

We also add handling for complete reads of the last chunk,
even if that chunk is partial, because we can send that to
the client for decompression.

This allows us to remove the slightly funky eof handling
in decompress_rnb, since we'll just not call that code in
this case now.  Note we'll still call decompress_rnb, etc,
for writes if they start before EOF and finish after EOF
(and are unaligned).  This is fine - this case should be
rare and if we hit it, we'll notice there's nothing to
decompress and proceed accordingly.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I50295f2803af611de5069d094c0a5d1b0a4a9c2d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: put decompress_read in read prep
Patrick Farrell [Tue, 12 Dec 2023 17:29:22 +0000 (12:29 -0500)]
EX-7601 ofd: put decompress_read in read prep

ofd_decompress_read is called from ofd_write_prep for
writes, but from tgt_brw_read for reads.  This makes the
code a little harder to follow and makes it difficult to
check read side decompression against EOF.

Instead, we move the decompression call to ofd_preprw_read.
This makes no change to the real operations here, but makes
for better code (and more similar code between read and
write).

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibefd0a48ad08e83725f2df64618db60ba61c5ce0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53427
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: same-ify preprw_read and preprw_write
Patrick Farrell [Tue, 12 Dec 2023 17:07:35 +0000 (12:07 -0500)]
EX-7601 ofd: same-ify preprw_read and preprw_write

preprw_read and preprw_write have some sections which are
functionally the same but which have diverged slightly.
(These can't easily be shared between the functions.)

This is a short patch to make them more similar before
adding eof checking to reads.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7bce912e99e61a4eec4060d6b49d4917894b44c4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53426
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: don't read for writes past eof
Patrick Farrell [Tue, 12 Dec 2023 16:37:48 +0000 (11:37 -0500)]
EX-7601 ofd: don't read for writes past eof

There's no data past EOF, so there's no need to do
read-modify-writes when the entire write is past the chunk
at EOF.  So in that case, don't read up data and don't
attempt decompression.
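
The condition can be sketched in Python as below, assuming fixed-size
compression chunks. chunk_end() and write_needs_read() are hypothetical
helpers for illustration, not functions from the ofd code.

```python
def chunk_end(offset, chunk_size):
    """Round `offset` up to the next chunk boundary."""
    return -(-offset // chunk_size) * chunk_size


def write_needs_read(write_start, file_size, chunk_size):
    """Return True if a write could need existing data read up first.

    A write that starts at or beyond the end of the chunk containing EOF
    touches no existing data, so there is nothing to read or decompress.
    """
    return write_start < chunk_end(file_size, chunk_size)


CHUNK = 64 * 1024
print(write_needs_read(120000, 100000, CHUNK))
print(write_needs_read(131072, 100000, CHUNK))
```

In the example, a write starting at 120000 still lands in the chunk
that holds EOF, while one starting at 131072 begins in the next chunk
and needs no read.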

There's no explicit test for this, but this shows up
immediately in the random-offset copy tests, because they
seek and write various sizes to offsets past current EOF.

We also need this functionality for reads, because in some
cases the client will do reads past EOF (this is unusual,
but can still happen sometimes).  This is added in a
separate patch because it requires some code reorganization.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia2b598165d5645c5a44c3d58bea69c7e42f10e41
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: multiple reads in same chunk
Patrick Farrell [Tue, 12 Dec 2023 04:35:24 +0000 (23:35 -0500)]
EX-7601 ofd: multiple reads in same chunk

When doing DIO or if we get unusual cache behavior on the
client, multiple reads can hit the same chunk.

This only shows up in racing tests, but it's important to
handle.  We do this by making sure we start searching the
lnbs for decompression at the start of the last chunk we
decompressed.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81fbbba79b16066e6d4519c66030cc58e03d2de7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53419
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoRM-620 build: New tag 2.14.0-ddn127
Andreas Dilger [Tue, 2 Jan 2024 08:26:11 +0000 (01:26 -0700)]
RM-620 build: New tag 2.14.0-ddn127

New tag 2.14.0-ddn127

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I38dd7ae99d4594896d14224de68d6b42e83fde10

17 months agoEX-7601 tgt: objcount in RPC must be 1
Patrick Farrell [Fri, 29 Dec 2023 19:49:55 +0000 (14:49 -0500)]
EX-7601 tgt: objcount in RPC must be 1

Much of the BRW write code assumes objcount is one, but
there is some provision for multiple objects.

Since the code will break if we send it multiple objects,
add errors to make sure anyone changing it will notice.

This isn't strictly compression related, but compression
adds even more code which assumes this, so this protection
will be useful.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idcbf33fd14d4b1bd179c9516bed07cca907008bc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-8826 ofd: set compressed file size for fake writes
Patrick Farrell [Sat, 16 Dec 2023 22:40:16 +0000 (17:40 -0500)]
EX-8826 ofd: set compressed file size for fake writes

When using the fake writes fail_loc, file size setting is
done at the ofd layer, since the osd layer isn't used.  So
we must also handle the compressed file size for this case.

This fixes sanity test 399a with compression.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icda612405908166d043e1e568d0d8bd9cd0c5156
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: minor debug improvements
Patrick Farrell [Mon, 11 Dec 2023 23:15:44 +0000 (18:15 -0500)]
EX-7601 ofd: minor debug improvements

A smattering of minor debug improvements across several
patches, placed at the end because they're all minor and
some of them would disturb early parts of the series.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2071911eb09f5c7fad28203db05396bb31ccda59
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 osd: add assert for prepare partial page
Patrick Farrell [Mon, 20 Nov 2023 00:40:27 +0000 (19:40 -0500)]
EX-7601 osd: add assert for prepare partial page

In the write prep code, we read up any partial pages (pages
which are not completely overwritten by the write) to
prepare them for write.  But for compressed files, we will
have already done this to prepare for decompression.

Add an assert to make sure we catch if this is ever wrong.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1366b1f5b191a4d581448d692933d562198c3a1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: create read mapping for read-modify-write
Patrick Farrell [Sun, 5 Nov 2023 15:55:29 +0000 (10:55 -0500)]
EX-7601 ofd: create read mapping for read-modify-write

When we need to do a read-modify-write for unaligned writes
to a compressed file, it's important we read only the
portion of the file which is receiving unaligned IO.

This patch identifies these chunks in preprw_write and
creates a read lnb mapping from a subset of the pages for
write.  These pages we read up are then decompressed.
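
To illustrate which chunks receive unaligned IO, here is a small Python
sketch that returns the chunk indices only partially covered by a write.
chunks_to_read() and the fixed chunk size are assumptions made for this
example; the real code works on lnb page arrays rather than byte ranges.

```python
def chunks_to_read(offset, length, chunk_size):
    """Return the indices of chunks only partially covered by a write."""
    if length <= 0:
        return []
    end = offset + length                       # exclusive end of write
    first = offset // chunk_size
    last = (end - 1) // chunk_size
    partial = []
    if offset % chunk_size:                     # write starts mid-chunk
        partial.append(first)
    if end % chunk_size and last not in partial:  # write ends mid-chunk
        partial.append(last)
    return partial


CHUNK = 64 * 1024
print(chunks_to_read(1000, 200000, CHUNK))
```

Chunks fully covered by the write are overwritten completely, so only
the partially covered first and last chunks appear in the result.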

Note one issue this patch does not address is reading of
data past EOF.  If the final chunk is unaligned, we will
round the write to cover it.  This results in extending the
file inappropriately, writing zeroes where they aren't
needed.  The read side gives us the info to address this,
which we will do in a future patch.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iede43f12127cbb93e73c22a915192aa2f814a927
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52997
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: distinguish nr_write and nr_read
Patrick Farrell [Fri, 3 Nov 2023 20:29:51 +0000 (16:29 -0400)]
EX-7601 ofd: distinguish nr_write and nr_read

We will have two counts of pages in lnbs, so distinguish
between them.

Not actually used yet - will be calculated when the read
lnb mapping is created.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I709b8fd299163d348a196184152bb0294fcb650b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: add read lnb to ofd_preprw_write
Patrick Farrell [Fri, 3 Nov 2023 20:22:22 +0000 (16:22 -0400)]
EX-7601 ofd: add read lnb to ofd_preprw_write

The read phase of read-modify-write for compressed files
needs to read only a subset of the pages which will be
written, so it needs a separate set of lnb pointers for
tracking this subset.

This patch passes around the necessary argument but does
not set up or use the lnb yet.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7ec7101e65e73a6c9e67cea3c58d8cace38e70e0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52984
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoLU-17370 utils: simplify lfs help text
Alexandre Ioffe [Thu, 21 Dec 2023 06:53:42 +0000 (22:53 -0800)]
LU-17370 utils: simplify lfs help text

Simplify help text for lfs getstripe and lfs setstripe.
Update corresponding man pages lfs-getstripe and lfs-setstripe.
In the man pages, use left adjustment and disable hyphenation
('.ad l', '.nh') to prevent hyphenation of keywords.

Lustre-change: https://review.whamcloud.com/53564
Lustre-commit: TBD (from 6c3dae58eddc2e3c7caf35599733b2e59ebeb657)

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial
Change-Id: Iae9d3534230ee7d325fbeffd78b5c12632a4a161
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53523
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17186 utils: replace gethostby*() with get*info()
Sebastien Buisson [Fri, 8 Dec 2023 19:53:12 +0000 (11:53 -0800)]
LU-17186 utils: replace gethostby*() with get*info()

This patch replaces the deprecated gethostbyname() and
gethostbyaddr() functions with getaddrinfo() and getnameinfo()
functions respectively.

The getaddrinfo() function combines the functionality provided by the
gethostbyname() and getservbyname() functions into a single interface,
but unlike the latter functions, getaddrinfo() is reentrant and allows
programs to eliminate IPv4-versus-IPv6 dependencies.

The getnameinfo() function is the inverse of getaddrinfo(): it
converts a socket address to a corresponding host and service, in a
protocol-independent manner. It combines the functionality of
gethostbyaddr() and getservbyport(), but unlike those functions,
getnameinfo() is reentrant and allows programs to eliminate
IPv4-versus-IPv6 dependencies.
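
The patch itself changes C code, but the same pair of interfaces can be
tried out through Python's socket module, which wraps getaddrinfo() and
getnameinfo(). The helper names resolve() and reverse() and the port
number below are only examples, not part of the patch.

```python
import socket


def resolve(host, port):
    """Return socket addresses for host:port, IPv4 or IPv6 alike."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [sockaddr for _family, _type, _proto, _canon, sockaddr in infos]


def reverse(sockaddr, numeric=False):
    """Map a socket address back to (host, service) strings."""
    flags = (socket.NI_NUMERICHOST | socket.NI_NUMERICSERV) if numeric else 0
    return socket.getnameinfo(sockaddr, flags)


# A numeric address avoids any DNS lookup in this example.
addr = resolve("127.0.0.1", 988)[0]
print(addr, reverse(addr, numeric=True))
```

Because the caller only handles opaque socket addresses, the same code
path works unchanged when the resolver returns IPv6 results.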

Lustre-change: https://review.whamcloud.com/52632
Lustre-commit: TBD (from 99687573d33336a153c1a5b94a4b66ebbcc2d0f1)

Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iacb5583826cd2f7329455bc6cbb4477f9087f15a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53386
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17261 lov: ignore broken components
Alex Zhuravlev [Sun, 5 Nov 2023 13:51:29 +0000 (16:51 +0300)]
LU-17261 lov: ignore broken components

If some component of a mirrored file is broken, it makes sense
to try another (possibly valid) replica rather than give up
immediately.

Lustre-change: https://review.whamcloud.com/52996
Lustre-commit: 902fe290e51dccdee89380fb725ae6e3c1802e2b

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I32ea0efa90109f5159bf8b6c4e0efe1d543580c3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoRM-620 build: New tag 2.14.0-ddn126
Andreas Dilger [Fri, 29 Dec 2023 11:20:01 +0000 (04:20 -0700)]
RM-620 build: New tag 2.14.0-ddn126

New tag 2.14.0-ddn126

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: I67599c5b0918be8761e0561cc9c60e39f171196f

17 months agoLU-16859 lnet: incorrect check for duplicate NI
Serguei Smirnov [Tue, 31 Oct 2023 21:11:54 +0000 (14:11 -0700)]
LU-16859 lnet: incorrect check for duplicate NI

When an NI is being added to an existing LNet, the check against
existing NI interface names currently fails if the new NI
happens to use an interface name which is a prefix of one used
by an existing NI.

The following example assumes ib0 and its alias ib0:1 are
configured:

lnetctl net add --net o2ib --if ib0:1
lnetctl net add --net o2ib --if ib0

Fix this by making sure interface strings are compared properly
regardless of relative length.
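
The class of bug can be illustrated with a short Python sketch. Both
functions are hypothetical and are not the LNet code: the first mimics
a comparison limited to the length of the new name, the second compares
whole names.

```python
def is_duplicate_prefix(existing, new):
    """Flawed check: compares only the first len(new) characters."""
    return existing[:len(new)] == new


def is_duplicate(existing, new):
    """Compare whole interface names, whatever their relative lengths."""
    return existing == new


# "ib0" is a prefix of the already configured alias "ib0:1".
print(is_duplicate_prefix("ib0:1", "ib0"), is_duplicate("ib0:1", "ib0"))
```

With the length-limited comparison, adding ib0 after ib0:1 is wrongly
treated as a duplicate; comparing the complete strings avoids this.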

Lustre-change: https://review.whamcloud.com/52918
Lustre-commit: 7dcdb9eb0ded98e956fe417abbd835433a8de3f0

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I0d4047118e7d9982fa791a2e324a27aa5d4abaee
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoLU-16552 test: add new lnet test for Multi-Rail setups
James Simmons [Sun, 13 Aug 2023 15:02:33 +0000 (11:02 -0400)]
LU-16552 test: add new lnet test for Multi-Rail setups

You can crash the lnet kernel module by setting up an interface with
lctl net up and then attempting to set up the interface with
the import function. This is due to improper clearing of the net_cpts
array.

Currently sanity-lnet.sh doesn't really test MR setups. Because of
this a few bugs slipped in. Add two new tests to ensure MR setups
behave properly. Test 107 checks that deleting a second interface
in an MR setup doesn't crash a node. Test 108 creates a Multi-Rail
setup of a tcp LNet net with two interfaces, one real and the
other fake. A bug was preventing the second fake interface from
being added.

Lustre-change: https://review.whamcloud.com/50302
Lustre-commit: 8785f25b053c69b4303e901c6c8dc5d0d4d6dfc1

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ic69e14bd0617f4d6fe931140b5b6d43b795843cf
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53529
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoLU-12031 mdt: explicit data version of DoM files
Mikhail Pershin [Mon, 25 Apr 2022 06:13:53 +0000 (09:13 +0300)]
LU-12031 mdt: explicit data version of DoM files

Use EA to store 'data_version' for DoM files explicitly.

Unlike OST objects, the 'inode_version' of a DoM file is changed
by metadata operations as well, and that leads to problems
during HSM operations. For example, writing the HSM EA with the file
data version inside causes a DoM object version update, making this
HSM EA version obsolete. Also, any metadata update on a
restored file makes it dirty and prevents a second release.

DoM files now have an explicitly updated 'data_version' in
addition to the ordinary 'inode_version'. The 'data_version'
is updated along with 'inode_version' upon write/truncate and
fallocate operations and is stored as the 'trusted.dataver' EA.
The layout swap procedure is updated to move the data version between
the files being swapped along with the HSM attributes.
If a DoM file is migrated to a RAID0 file then the 'dataver' EA is
deleted.

Corresponding test 1f is added to sanity-hsm.sh and
207j to sanity.sh.

Lustre-change: https://review.whamcloud.com/47139
Lustre-commit: aae3289adb2bbc192870f195b78044484f717e16

Test-Parameters: clientversion=2.12.4 testlist=sanity-hsm
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I4689c56394c7323d32cd6f7dd86f58beb6e53353
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53214
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-12998 mds: add no_create parameter to stop creates
Andreas Dilger [Sat, 23 Apr 2022 00:10:36 +0000 (18:10 -0600)]
LU-12998 mds: add no_create parameter to stop creates

Add a target tunable parameter and mount option "no_create" to
disable new *directory* creation on an MDT.  This sends the
flag OS_STATFS_NOCREATE to the clients, and the DNE MDT space
balance will avoid selecting that MDT when creating a new
subdirectory, without disabling access to existing files/dirs.

This allows "soft disabling" an MDT in advance of storage
upgrades to minimize new directories and files created on that
MDT, reduce future migration, and/or backup/restore workload.

As yet it does not totally disable *file* creation on the MDT,
but it may be extended to do so in the future.

This is analogous to the "no_precreate" option that was added
on the OSTs, and "no_create" has been added to the OSTs for
consistency ("no_precreate" is kept for compatibility for now).

lod_declare_create() checks whether the directory create target MDT is
the current MDT; a mismatch may happen if nocreate is set on some MDT.
Upon such a mismatch, call dt_statfs() to fetch the latest statfs to
learn whether nocreate is set.

lmv_create() will choose another MDT if the target MDT is set with
nocreate, but in case the flag has been cleared, call obd_statfs() to
fetch the cached statfs and check again.

Lustre-change: https://review.whamcloud.com/47124
Lustre-commit: 1dbcd0bab881fac38d8a5e4ef1559f12618f8f0e
Lustre-change: https://review.whamcloud.com/53437
Lustre-commit: 066262a04cb8e0cbf49a20b7bf036d4484399afe (TBD)

Test-Parameters: testlist=conf-sanity env=ONLY=112b,ONLY_REPEAT=50
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I53cfb48ade2f844b18bfc630e7fcea6de9ce7057
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53189
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17263 utils: 'lfs find -blocks' to use 512-byte units
Andreas Dilger [Sun, 5 Nov 2023 05:32:19 +0000 (23:32 -0600)]
LU-17263 utils: 'lfs find -blocks' to use 512-byte units

Change the default units for 'lfs find -blocks' from 1KiB blocks
to 512-byte blocks to better match the behavior of find(1).  This
also matches what "-printf %b" will print.

Change llapi_parse_size() to accept a 'c' argument to specify
characters, and accept a "B" or "iB" suffix if provided.
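
A simplified Python sketch of size parsing with 512-byte default units
follows. It is not the llapi_parse_size() grammar: parse_size(), the
set of accepted letters, and the choice to treat every multiplier as
binary (k = 1024) are assumptions made for illustration only.

```python
import re

_MULTIPLIERS = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30, "t": 1 << 40}
# number, then either 'c' or a multiplier with an optional B/iB suffix
_SIZE_RE = re.compile(r"(\d+)(?:(c)|([kKmMgGtT])(?:i?B)?)?")


def parse_size(text, default_unit=512):
    """Parse strings such as '8', '100c', '4k' or '2MiB' into bytes."""
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, chars, multiplier = match.groups()
    value = int(number)
    if chars:                        # 'c': the value is already in bytes
        return value
    if multiplier:                   # k/M/G/T, optionally with B or iB
        return value * _MULTIPLIERS[multiplier.lower()]
    return value * default_unit      # bare number: 512-byte blocks


print(parse_size("8"), parse_size("100c"), parse_size("2MiB"))
```

Under these assumptions a bare "8" means eight 512-byte blocks, which
mirrors how find(1) interprets a size given without a unit.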

Lustre-change: https://review.whamcloud.com/52993
Lustre-commit: 869ea3211d2f15d7c674bc10e5f1a3272e44504e

Fixes: c043f46025 ("LU-10705 utils: add "lfs find --blocks"")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8345f15bf53912501cadc0fa7f981a9f787b767
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53522
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17349 tests: sanity-quota_81 decrease timeout
Sergey Cheremencev [Sun, 3 Dec 2023 04:06:23 +0000 (07:06 +0300)]
LU-17349 tests: sanity-quota_81 decrease timeout

Decrease cfs fail timeout in sanity-quota_81 from 30
to 10 seconds to avoid soft lockup.

Lustre-change: https://review.whamcloud.com/53384
Lustre-commit: b58219ef1edebcb266cbe0dfede491ba5de491d1

Fixes: 862f0baa7c21 ("LU-15097 quota: stop pool_recalc before killing pool")
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8630db7b3948b335fef5d5349f960f79cb877fc3
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53516
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17358 lprocfs: make job_stats job_id valid yaml
Nathaniel Clark [Tue, 12 Dec 2023 18:05:22 +0000 (13:05 -0500)]
LU-17358 lprocfs: make job_stats job_id valid yaml

Fix quoting job_id to account for leading '@' being reserved.
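
For illustration, the Python sketch below quotes a job_id when emitting
it as a YAML scalar. yaml_job_id() is a hypothetical helper, not the
lprocfs code. Quoting purely numeric ids as well is an extra assumption
of this example, added so that they load back as strings.

```python
def yaml_job_id(job_id):
    """Render a job_id as a YAML scalar that loads back as a string."""
    text = str(job_id)
    reserved_start = text[:1] in ("@", "`")   # reserved YAML indicators
    looks_numeric = text.isdigit()            # would otherwise load as int
    if reserved_start or looks_numeric or text == "":
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return text


for job in ("@job.1", "rsync.1000", "1234"):
    print("job_id:", yaml_job_id(job))
```

A plain scalar may not begin with '@' in YAML, so such a job_id is
emitted as a double-quoted string instead.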

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Ifce3edc9b636db2f059ab9960488972a152d2e7a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53424
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53519

17 months agoEX-8780 test: wait osts up after restart
Hongchao Zhang [Sat, 9 Dec 2023 20:43:49 +0000 (04:43 +0800)]
EX-8780 test: wait osts up after restart

In test_18e of sanity-lfsck, the OSTs might not be ready on all MDTs,
and the LFSCK status will then be incorrect because the LFSCK
notification cannot be sent to all OSTs.

Change-Id: If1ed5d920d5c8b99d42f59f92a1e245a9e2a8267
Test-Parameters: trivial testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53531
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoLU-17289 test: disable sanity/test_906 temporarily
Qian Yingjin [Thu, 7 Dec 2023 09:45:01 +0000 (04:45 -0500)]
LU-17289 test: disable sanity/test_906 temporarily

On RHEL9.3, the fio io_uring engine test fails with the error
"Operation not permitted" on both local file systems (ext4 and
xfs) and Lustre:

    "fio: pid=4551, err=1/file:engines/io_uring.c:1047,
    func=io_queue_init, error=Operation not permitted"

This is a generic failure in RHEL9.3.  Thus we disable
sanity/test_906 temporarily until the bug is fixed in RHEL9.3.

Lustre-change: https://review.whamcloud.com/53362
Lustre-commit: TBD (from 0eef4b0818e7a1a42a54333fa713ef660c7e9404)

Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I3805b475c5f3d0b62dc6c57c4cd93f2bc1b67b76
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53546
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-17000 gss: Fix Out-of-bounds access under svcgssd_proc.c
Arshad Hussain [Wed, 1 Nov 2023 06:50:53 +0000 (12:20 +0530)]
LU-17000 gss: Fix Out-of-bounds access under svcgssd_proc.c

The problem reported by Coverity was that a 32-bit type was passed
and then dereferenced as a larger 64-bit type in the function
handle_channel_request(). This patch addresses the issue.

Since this is a uAPI, and to catch corner cases such as
kernel modules being updated separately from user tools,
RSI_DOWNCALL_MAGIC is also changed from 0x6d6dd62a to
0x6d6dd63a.

This patch also changes the 32-bit member (sid_hash) of
'struct rsi_downcall_data' to 64-bit, which also requires
changing wiretest.c and wirecheck.c.
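
The class of defect described above can be illustrated with a minimal
sketch. The function names below are hypothetical and are not taken from
svcgssd_proc.c; the sketch only shows why a 32-bit object must be widened
before a consumer reads 64 bits through a pointer to it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical consumer: expects a full 64-bit value behind the pointer. */
uint64_t read_hash64(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));	/* always reads 8 bytes */
	return v;
}

/*
 * Buggy pattern (not compiled here): calling read_hash64(&hash32) with a
 * uint32_t hash32 reads 4 bytes past the end of the object.
 *
 * Safe pattern: widen into a real 64-bit object first.
 */
uint64_t widen_hash(uint32_t hash32)
{
	uint64_t hash64 = hash32;

	return read_hash64(&hash64);
}
```

Changing the structure member itself to 64-bit, as the patch does, removes
the need for a widening copy at every call site.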

Lustre-change: https://review.whamcloud.com/52920
Lustre-commit: 7d764f1f11be144ad26e33aa8cecedc5bb708793

CoverityID: 404758 ("Out-of-bounds access")
Fixes: 4daf43ac3c ("LU-17015 gss: support large kerberos token for rpc sec init")
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8041cd4063f1b1cefdebf5681df426be61820f99
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-14918 osd: don't declare similar ldiskfs writes twice
Alex Zhuravlev [Tue, 7 Dec 2021 08:13:54 +0000 (11:13 +0300)]
LU-14918 osd: don't declare similar ldiskfs writes twice

In some cases (like overstriping) the same operations can be
declared multiple times (new llog records), and this leads to a
huge number of credits and performance degradation. We can
avoid this by checking for duplicate declarations.
As every declaration would need an allocation, limit the scope
of these checks to transactions likely to be large.

% of "large" transactions in sanity-benchmark, depending on threshold:

  creates < 5 && writes < 5:
  0.58% (mds1) and 2.97% (mds2)

  creates < 7 && writes < 7:
  0.58% and 2.4%

  creates < 9 && writes < 9:
  0.6% and 1.85%

  creates < 10 && writes < 10:
  0.0004% and 0.000001%

Thus 10 creates or writes is selected as the threshold to enable this
logic.
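
A minimal sketch of the idea, assuming a simplified tracker: the struct,
its fields, the fixed-size array, and the exact comparison against the
threshold are all hypothetical and do not reflect the real osd-ldiskfs
data structures. Only the threshold value of 10 comes from the text
above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEDUP_THRESHOLD	10	/* threshold named in the commit message */
#define MAX_TRACKED	64	/* hypothetical capacity for this sketch */

struct decl_tracker {
	unsigned int	dt_writes;		/* writes declared so far */
	size_t		dt_nr;			/* number of tracked objects */
	uint64_t	dt_seen[MAX_TRACKED];	/* hypothetical object ids */
	unsigned int	dt_credits;		/* credits reserved so far */
};

/* Returns true if credits were reserved, false if skipped as duplicate. */
bool declare_write(struct decl_tracker *t, uint64_t obj, unsigned int credits)
{
	t->dt_writes++;

	/* Only pay for duplicate tracking once the transaction looks large. */
	if (t->dt_writes >= DEDUP_THRESHOLD) {
		for (size_t i = 0; i < t->dt_nr; i++)
			if (t->dt_seen[i] == obj)
				return false;
		if (t->dt_nr < MAX_TRACKED)
			t->dt_seen[t->dt_nr++] = obj;
	}

	t->dt_credits += credits;
	return true;
}
```

Small transactions take the cheap path and never touch the tracking array,
which matches the stated goal of limiting the checks to transactions likely
to be large.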

Lustre-change: https://review.whamcloud.com/45765
Lustre-commit: 9e6225b2e7385cbb7be0474df01075fafc4966d5

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7c893fe3b95646b4b813b999bc832659dfcf03ad
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53485
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago EX-7601 ofd: add decompress_read to ofd_preprw_write
Patrick Farrell [Fri, 3 Nov 2023 18:20:37 +0000 (14:20 -0400)]
EX-7601 ofd: add decompress_read to ofd_preprw_write

We have read up the compressed data from disk; now we must
decompress it so we can rewrite it successfully.

This code still works on the whole lnbs rather than just on
the portion of it which is unaligned.  This is temporary
and will be resolved by a future patch.

With this patch, we have basic read-modify-write support,
so we can re-enable testing.  The next patch adds tests
for read-modify-write.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib6503c15e9fb3d425a7bc295bcc61b41c089a1f0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52983
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 ofd: add read to write process
Patrick Farrell [Thu, 2 Nov 2023 22:03:15 +0000 (18:03 -0400)]
EX-7601 ofd: add read to write process

This adds a very simple read to the write process, which
just reads up the entire chunk-rounded write range.

This is a first step - the read will eventually be modified
to only read the unaligned portions which must be
decompressed for read-modify-write.  We will create a
special lnb mapping which contains only the pages which must
be read for decompression (similar to the tx lnb mapping).

For now, this read allows us to test decompression without
handling the mapping.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I169ddc2e161094aebdad1a60ec62e9c1d75cd6d8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 ofd: add chunk rounding to write
Patrick Farrell [Thu, 2 Nov 2023 21:40:38 +0000 (17:40 -0400)]
EX-7601 ofd: add chunk rounding to write

For compressed files, we need to round all niobufs to
chunk size in the write process, so we have buffers for
reading in and rewriting the complete chunks.

dt_bufs_get sets up the local niobuf for the write, so we
round before calling it.
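
The rounding step can be sketched as follows. This is an illustration
only: struct io_range and round_to_chunk() are hypothetical names, and
the sketch assumes the chunk size is a power of two.

```c
#include <stdint.h>

/* Hypothetical half-open byte range [start, end). */
struct io_range {
	uint64_t start;
	uint64_t end;
};

/* Expand [offset, offset + len) outward to chunk boundaries. */
struct io_range round_to_chunk(uint64_t offset, uint64_t len,
			       uint64_t chunk_size)
{
	uint64_t mask = chunk_size - 1;	/* valid for power-of-two sizes */
	struct io_range r;

	r.start = offset & ~mask;		/* round start down */
	r.end = (offset + len + mask) & ~mask;	/* round end up */
	return r;
}
```

With a 64 KiB chunk, a 4 KiB write at offset 4 KiB is expanded to the range
0 to 64 KiB, so the buffers cover the complete chunk that must be read and
rewritten.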

Note this breaks writing to compressed files, which is not
fixed until a few patches later.  For this reason, we
disable the compression tests.  They will be reenabled
shortly - similar to how we handled the read series.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I413aaba9866dd7d6c4463fa620eadf1423379ba1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 vvp: remove unaligned write restriction
Patrick Farrell [Fri, 3 Nov 2023 16:29:58 +0000 (12:29 -0400)]
EX-7601 vvp: remove unaligned write restriction

This series will resolve the unaligned write issue for
compressed files, so we need to remove the restriction on
unaligned writes in order to test it.

This does not mean unaligned writes are working yet, but we
need to make this change so the subsequent patches can be
tested.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4abcbdcd18b00718099483c8dfdb9a7aa41c3ce7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52981
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 ofd: switch preprw to chunk_bits
Patrick Farrell [Fri, 3 Nov 2023 16:33:20 +0000 (12:33 -0400)]
EX-7601 ofd: switch preprw to chunk_bits

The compression/decompression code requires chunk_bits
rather than chunk size.  Since we need to call this code
from ofd_preprw_write, we need chunk_bits there.

This modifies the functions so chunk_bits is available
there.
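
For readers unfamiliar with the two representations, the relationship can
be sketched as below, assuming chunk_bits is the base-2 logarithm of the
chunk size in bytes. The helper names are hypothetical and are not Lustre
functions.

```c
#include <stdint.h>

/* Chunk size in bytes for a given chunk_bits. */
uint64_t chunk_size_from_bits(unsigned int chunk_bits)
{
	return 1ULL << chunk_bits;
}

/* Index of the chunk containing a byte offset. */
uint64_t chunk_index(uint64_t offset, unsigned int chunk_bits)
{
	return offset >> chunk_bits;
}

/* Byte offset within its chunk. */
uint64_t chunk_offset(uint64_t offset, unsigned int chunk_bits)
{
	return offset & ((1ULL << chunk_bits) - 1);
}
```

Carrying the bit count lets callers derive the size, the index, and the
in-chunk offset with shifts and masks, without a division.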

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id98e6d6364eeaaa7753a8aba059387e3e659d2a2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 tgt: add tx lnb for writes
Patrick Farrell [Thu, 2 Nov 2023 21:55:01 +0000 (17:55 -0400)]
EX-7601 tgt: add tx lnb for writes

With compression, the lnbs used for the disk IO on the
server can contain more data than the client requested,
due to reading up whole chunks for decompression.

This means the client is only going to write data into
a subset of the lnbs used for I/O to storage.

We handle this the same way we do for reads:
We create a second set of lnbs just for the transfer, and
point these lnbs at the pages which will actually receive
data from the client.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5b668547537698309792daf309842866be79f0b6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52965
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 tgt: add remote_pages for writes
Patrick Farrell [Thu, 2 Nov 2023 21:49:00 +0000 (17:49 -0400)]
EX-7601 tgt: add remote_pages for writes

When we round a write to get all of the compressed chunks,
the number of local and the number of remote pages will
differ.  We need to make sure we do the checksum and data
transfer using the number of remote pages, not the number of
local pages.

This patch calculates the number of remote pages and uses it
accordingly.

Note that just like on the read side, this patch doesn't do
anything until we're actually rounding the chunks for IO in
a later patch.
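
Why the two page counts differ can be seen from a small sketch. The helper
below is hypothetical and assumes 4 KiB pages; it simply counts the pages
touched by a byte range.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Number of pages touched by the byte range [offset, offset + len). */
uint64_t pages_spanned(uint64_t offset, uint64_t len)
{
	if (len == 0)
		return 0;
	return ((offset + len - 1) >> SKETCH_PAGE_SHIFT) -
	       (offset >> SKETCH_PAGE_SHIFT) + 1;
}
```

For a 4 KiB client write at offset 4 KiB, the remote (client) range touches
1 page, while the same write rounded to a 64 KiB chunk touches 16 local
pages. The checksum and bulk transfer must use the first count.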

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I38256070d68246613ce67b0bfe328f6443a95533
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52964
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 ofd: round range locking
Patrick Farrell [Mon, 20 Nov 2023 00:18:37 +0000 (19:18 -0500)]
EX-7601 ofd: round range locking

The range locking in OFD needs to be rounded for
compressed chunks.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I530d7f655a1c09033b1a3668c009072874ab1d18
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53178
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 tgt: round write lock to chunk
Patrick Farrell [Thu, 2 Nov 2023 21:31:41 +0000 (17:31 -0400)]
EX-7601 tgt: round write lock to chunk

For unaligned writes, we need to round the write locking to
cover any leading or trailing chunks.  We do this by
creating a local 'remote niobuf' to describe the rounded
range and doing the locking against that niobuf.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2bdea620386ad229375647a0e2cc6180c9bd7aa6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 tgt: identify writes to round
Patrick Farrell [Thu, 2 Nov 2023 21:26:58 +0000 (17:26 -0400)]
EX-7601 tgt: identify writes to round

If the beginning or end of a client write is unaligned, we
must round the locking.  This patch identifies writes where
this is required; the next patch will do the locking.
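
The alignment test can be sketched as below. The function name is
hypothetical, the chunk size is assumed to be a power of two, and the real
check may take further conditions into account.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if either edge of [offset, offset + len) is not chunk aligned. */
bool write_needs_rounding(uint64_t offset, uint64_t len, uint64_t chunk_size)
{
	uint64_t mask = chunk_size - 1;	/* valid for power-of-two sizes */

	return (offset & mask) != 0 || ((offset + len) & mask) != 0;
}
```

A write that starts and ends on chunk boundaries needs no rounding; any
other write touches a partial leading or trailing chunk.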

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iec140c24423a0da478f6d42ff6fc620d7ad3ba4a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago EX-7601 ofd: clear pages in decompression
Patrick Farrell [Thu, 30 Nov 2023 19:59:07 +0000 (14:59 -0500)]
EX-7601 ofd: clear pages in decompression

Handling writes to compressed files requires a
read-modify-write cycle, which has implications for how we
handle reads.

Consider the case of a file with an 8 KiB write at offset 0,
which is compressed to 4 KiB.  Then there is another 4 KiB
write at offset 16 KiB.

Updating this correctly requires reading the first chunk,
then decompressing it.  However, this read will go past
EOF, because the write has not occurred yet.  The OSD read
code does not fill in these pages, because read past EOF is
not returned to the client (client gets a short read and
does not actually use the pages).

In our case, however, we must use these pages (from 8 KiB to
16 KiB).  In the naive version without recompression, we
simply write out 0 - 16 KiB, so we must have zeroes in
those pages.  Once we have recompression, we must compress
those pages, so we need zeroes in that case too.

So we note if a page has data in it after decompression,
then if it does not, we clear the page.  Note we do NOT set
lnb_rc to 0 when we clear a page, because lnb_rc = 0 is
used to indicate EOF rather than a gap in the file.
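
A minimal sketch of the clearing step, under simplifying assumptions: the
struct below is hypothetical and only stands in for a local niobuf and its
page, the has_data flag represents "decompression put data in this page",
and the page size is assumed to be 4 KiB.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for a local niobuf and its page. */
struct sketch_page {
	unsigned char	data[SKETCH_PAGE_SIZE];
	bool		has_data;	/* set if decompression filled the page */
	int		rc;		/* stands in for lnb_rc */
};

/* Zero every page that holds no data, leaving rc untouched. */
void clear_unfilled_pages(struct sketch_page *pages, int nr)
{
	for (int i = 0; i < nr; i++) {
		if (!pages[i].has_data)
			memset(pages[i].data, 0, SKETCH_PAGE_SIZE);
		/* rc is deliberately not reset: 0 would signal EOF */
	}
}
```

Leaving rc alone mirrors the note above that a cleared page represents a
gap in the file rather than EOF.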

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If1d1360185eb087e821167a08e49c9427e29ffc4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 obd: do not decompress empty lnbs
Patrick Farrell [Tue, 12 Dec 2023 20:54:37 +0000 (15:54 -0500)]
EX-7601 obd: do not decompress empty lnbs

For reads which cross EOF, we may get lnbs with no data in
them (similarly for writes which cross EOF).

For these cases, it's important to only copy from the lnbs
where there is data, and only do decompression on the lnbs
if there's actually data in them.

Modify merge_chunk to do this.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I83fefcfa6d1396dcd97fad994334bf29438bb4bf
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53430
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7601 obd: add error check in merge_chunk
Patrick Farrell [Wed, 29 Nov 2023 01:45:43 +0000 (20:45 -0500)]
EX-7601 obd: add error check in merge_chunk

If the lnbs we're trying to merge have an error recorded in
them, then they're not going to be valid input for
decompression, so return an error.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1bf17131cb65106087eb5e72e2700db30c0cc975
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53274
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago EX-7644 mmap: add mmap support for compression
Patrick Farrell [Wed, 25 Oct 2023 16:15:59 +0000 (12:15 -0400)]
EX-7644 mmap: add mmap support for compression

This removes the EOPNOTSUPP for compression with mmap and
adds an mmap sanity test for compression.  This patch
removes all the restrictions for mmap, but we actually only
have unaligned read support right now, so the test is
deliberately simplified to only test reads.

A more complicated version which also tests mmap writes
comes later in the series, once read-modify-write is
supported.

The test exercises mmap by copying data at several different
block sizes with several different compression chunk sizes.

Test-Parameters: testlist=sanity-compr env=ONLY="1003",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4a37b106831a903d90e8a8871e9a93baac4e201e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52280
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago LU-17317 gss: no cache flush for rsi and rsc
Sebastien Buisson [Tue, 5 Dec 2023 16:02:21 +0000 (17:02 +0100)]
LU-17317 gss: no cache flush for rsi and rsc

RPCSEC init and RPCSEC context caches hold gss-related information
of security contexts established between network peers. These cache
entries are tightly coupled with contexts handled in the sptlrpc layer
so they must not be purged directly. They are inserted into the cache
when sptlrpc security contexts are established, and removed when the
corresponding security contexts are destroyed.

Lustre-change: https://review.whamcloud.com/53377
Lustre-commit: 3615fa4a86be793652d53c94818c5aeb81e2257e

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Fixes: 4daf43ac3c ("LU-17015 gss: support large kerberos token for rpc sec init")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I903f75a4b5229286fcaed3e9d96b5eee7f653f15
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-17015 gss: remove legacy sunrpc-cache based gss caches
Sebastien Buisson [Thu, 14 Sep 2023 12:23:07 +0000 (14:23 +0200)]
LU-17015 gss: remove legacy sunrpc-cache based gss caches

Now that GSS caches are based on Lustre's internal upcall cache
mechanism, we can remove the legacy ones based on the sunrpc cache
implementation, as this code is unused.

We can also remove support for updated get_expiry() in Linux 6.3, as
this function is no longer used.

Lustre-change: https://review.whamcloud.com/52376
Lustre-commit: 8665ba238412f407963724413e137b89d5cd384f

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I98d8777d225c723ae061ef360011abfc092e09d8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Xing Huang <hxing@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-17015 gss: avoid request replay
Sebastien Buisson [Fri, 13 Oct 2023 15:19:16 +0000 (17:19 +0200)]
LU-17015 gss: avoid request replay

Lustre's upcall cache has a retry mechanism in case the upcall was
interrupted or failed and we timed out waiting. In this case we do our
best to retry and do the upcall again.
But when the upcall cache is used for GSS contexts, the upcall cannot
be done twice with the same data. The GSSAPI implements security
measures that forbid that kind of request replay, to prevent
man-in-the-middle attacks for instance.

Add a new uc_acquire_replay field to struct upcall_cache, so that
upcall cache users can tell whether the acquire upcall can be replayed.
For the identity upcall, this replay is fine. But for GSS contexts we
need to avoid those replays.
Also bump the upcall cache timeout value from 20s to 30s for GSS context
init requests.
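
How such a flag might gate the retry path is sketched below. Only the
field name uc_acquire_replay comes from the text above; its type, the
surrounding struct, and the helper are hypothetical and are not the real
upcall cache code.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct upcall_cache. */
struct sketch_upcall_cache {
	bool	uc_acquire_replay;	/* name from the commit; type assumed */
	int	uc_acquire_calls;	/* hypothetical counter of upcalls */
};

/*
 * Issue the acquire upcall unless this would be a replay on a cache
 * that forbids replays.  Returns true if the upcall was issued.
 */
bool sketch_try_acquire(struct sketch_upcall_cache *cache, bool already_sent)
{
	if (already_sent && !cache->uc_acquire_replay)
		return false;	/* e.g. GSS context: never send twice */

	cache->uc_acquire_calls++;
	return true;
}
```

An identity cache would set the flag and retry freely, while a GSS context
cache would leave it clear so a timed-out upcall is not sent again with the
same data.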

Also add more debug messages to gss code for both client and server
sides, and both kernel and userspace.

Lustre-change: https://review.whamcloud.com/52689
Lustre-commit: d0194a4b5f6efa26d5473c2793b525f5fdb77e67

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I56decc83a4f0d21be420e87cb0417826011932af
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53255
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-17015 gss: support large kerberos token for rpc sec ctxt
Sebastien Buisson [Thu, 7 Sep 2023 07:33:36 +0000 (09:33 +0200)]
LU-17015 gss: support large kerberos token for rpc sec ctxt

If the current Kerberos setup is using large tokens, such as when the
PAC feature is enabled for Kerberos, authentication can fail because
the server side is unable to exchange the token between kernel and
userspace. This limitation is inherent to the sunrpc cache mechanism,
which can only handle tokens up to PAGE_SIZE.

For the RPC sec context phase, use Lustre's upcall cache mechanism
instead of the kernel's deprecated sunrpc cache. Note this phase does
not involve a proper upcall; only the downcall part is relevant to
populate the context computed in userspace.

Lustre-change: https://review.whamcloud.com/52305
Lustre-commit: 473a41fec6fb600c9b6e26010d88772f5252d1e1

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I94e945a99cab60d5b6a4c40076c40fffede217ab
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53254
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-17317 gss: do not continue using expired reverse context
Sebastien Buisson [Fri, 8 Dec 2023 08:05:04 +0000 (09:05 +0100)]
LU-17317 gss: do not continue using expired reverse context

In case a server uses an expired gss context to send a callback
request to a client, it might be that the associated context on
the client has already expired, and been purged from the cache.
This results in a GSS_S_NO_CONTEXT reply.
In this specific scenario, the server must mark its reverse context
as dead. This will lead to destruction of the expired context, and
creation of a new context suitable for further callback requests.

Lustre-change: https://review.whamcloud.com/53375
Lustre-commit: TBD (65f91673262098aa6d97448f68a036b0f2cdfd98)

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4af90cd70a3815851ec555ea85b49714c8da4202
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago RM-620 build: New tag 2.14.0-ddn125
Andreas Dilger [Wed, 20 Dec 2023 08:55:47 +0000 (01:55 -0700)]
RM-620 build: New tag 2.14.0-ddn125

New tag 2.14.0-ddn125

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic2e4b53c8540ffd039359565c06294645a62d328

17 months ago LU-13791 mdt: parameter to tune capabilities
Andreas Dilger [Tue, 19 Dec 2023 02:07:58 +0000 (19:07 -0700)]
LU-13791 mdt: parameter to tune capabilities

Add mdt.*.enable_cap_mask to allow specific capabilities to
be enabled and disabled individually.

Fixes: f05edf8e2b ("LU-13791 sec: enable FS capabilities")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6fc0130a90693d673d8c2158e7e31c2de951553d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53500
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
17 months ago RM-620 build: New tag 2.14.0-ddn124
Andreas Dilger [Tue, 19 Dec 2023 06:10:54 +0000 (23:10 -0700)]
RM-620 build: New tag 2.14.0-ddn124

New tag 2.14.0-ddn124

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib274a425044bfbb22bc40bd51ccfda06ad6ba8b0

17 months ago LU-930 docs: fix whatis output
Timothy Day [Sun, 12 Mar 2023 15:19:54 +0000 (15:19 +0000)]
LU-930 docs: fix whatis output

The ".SH NAME" section has to be formatted in a certain
way for whatis and apropos to work correctly. Otherwise,
users will just see "(unknown subject)".

This patch fixes issues for all man pages.

Add a couple of one-line man page redirects.

Lustre-change: https://review.whamcloud.com/50264
Lustre-commit: 17bbf5bdd6f96f61dc0e39924dce540e91e1422c

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ie11eb921c84ff9ad19b50973c616f6fb6df1f461
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-12837 doc: add lfs-changelog* manpages
Etienne AUJAMES [Tue, 22 Nov 2022 12:39:25 +0000 (13:39 +0100)]
LU-12837 doc: add lfs-changelog* manpages

This patch moves the documentation for "lfs changelog" and "lfs
changelog_clear" utilities from "lfs.1" to the following manpages:
- lfs-changelog.1
- lfs-changelog_clear.1

Lustre-change: https://review.whamcloud.com/49209
Lustre-commit: 82e7ad348c77e5c164aa3e3155c9eb91872369d5

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Test-Parameters: trivial
Change-Id: I6db2e687e506a6116fe4755358a9abbd5509c3bb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53471
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-14651 build: fix build for el7.9 kernels
Andrew Perepechko [Mon, 18 Dec 2023 18:19:26 +0000 (11:19 -0700)]
LU-14651 build: fix build for el7.9 kernels

Handle extra setattr_prepare() argument added in Linux 5.12 kernels
when building on older kernels.

Lustre-change: https://review.whamcloud.com/53503
Lustre-commit: TBD (from cc03199c61df217f7da249d9f9f3419e0333c671)

HPE-bug-id: LUS-12059
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: Ie7fd1c4d51b7a9b086cfca0db941321cbcce7057
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53494
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago RM-620 build: New tag 2.14.0-ddn123
Andreas Dilger [Fri, 15 Dec 2023 03:52:43 +0000 (20:52 -0700)]
RM-620 build: New tag 2.14.0-ddn123

New tag 2.14.0-ddn123

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I33af32d0a44376aee90286496939c4bcb114abd8

18 months ago LU-17366 kernel: update SLES15 SP5 [5.14.21-150500.55.39.1]
Jian Yu [Thu, 14 Dec 2023 19:38:42 +0000 (11:38 -0800)]
LU-17366 kernel: update SLES15 SP5 [5.14.21-150500.55.39.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.39.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/53467
Lustre-commit: TBD (from 7084f80ec256f6a7335fe4d5981db1e8bcbed440)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp5 testlist=sanity

Change-Id: Id9476e8726728b00d4079cdaf31b081f89190eb1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53468
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>