Whamcloud - gitweb
fs/lustre-release.git
18 months ago RM-620 build: New tag 2.14.0-ddn120
Andreas Dilger [Thu, 7 Dec 2023 11:13:42 +0000 (04:13 -0700)]
RM-620 build: New tag 2.14.0-ddn120

New tag 2.14.0-ddn120

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I59aeafd089ff479f3ff735a04a805ec99ecadfdb

18 months ago LU-16392 utils: use --list-commands for bash completion
Thomas Bertschinger [Wed, 21 Dec 2022 16:52:50 +0000 (11:52 -0500)]
LU-16392 utils: use --list-commands for bash completion

The CLI utils lctl and lfs currently use a pseudo option
--non-existent-option to generate a list of completions. However, this
was broken when the help output for an invalid command was changed.
Using --list-commands instead means that the format of the help output
can be kept succinct.

However, currently there are 2 issues that make --list-commands
unsuitable.

First, --list-commands truncates long commands. This commit resolves
this by not truncating long commands, and removing the fixed-length
char buffer and writing directly to stdout so that the line length
can overflow slightly if needed.

Second, --list-commands recursively displays sub-commands. For
example, for `lctl`, it will display `pcc add`, `pcc del`, etc. in
addition to just `pcc`. The bash completion tools would view these
as separate tokens and thus would inappropriately suggest `add`,
`del`, etc. as completions for `lctl`. This commit removes the
recursive behavior.

Removing the recursive behavior resolves an unrelated bug with the
recursion that can be observed for `lctl`, where a number of
top-level commands are skipped following recursion into a previous
sub-command, equal to the number of subcommands processed in the
recursive call. Specifically, the commands in the section "device
setup", e.g. `attach`, `detach`, were not displayed following the
recursive call into `pcc`.

Finally, this commit changes the command parser to recognize --help
and print the list of commands when this argument is seen.
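
As a rough illustration of the listing behavior described above (top-level
names only, no truncation), here is a minimal sketch. The `struct cmd` type,
the `example_cmds` table and `list_commands()` are hypothetical names for
this example and are not the actual lctl/lfs parser code.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical command table entry; not the real Lustre parser type. */
struct cmd {
	const char *name;
	const struct cmd *subcmds;	/* NULL when there are no sub-commands */
};

/* Illustrative table: `pcc` has sub-commands, the others do not. */
const struct cmd example_pcc_subcmds[] = {
	{ "add", NULL }, { "del", NULL }, { NULL, NULL }
};
const struct cmd example_cmds[] = {
	{ "pcc", example_pcc_subcmds },
	{ "attach", NULL },
	{ "detach", NULL },
	{ NULL, NULL }
};

/*
 * Print only the top-level command names, space separated, wrapping at
 * roughly `width` columns.  Names are written straight to `out`, so a
 * long name is never truncated; it only makes its line run past `width`.
 * Returns the number of names printed.
 */
int list_commands(const struct cmd *cmds, FILE *out, size_t width)
{
	size_t col = 0;
	int count = 0;

	for (; cmds->name != NULL; cmds++) {
		size_t len = strlen(cmds->name);

		/* Start a new line if this name would not fit. */
		if (col > 0 && col + 1 + len > width) {
			fputc('\n', out);
			col = 0;
		}
		if (col > 0) {
			fputc(' ', out);
			col++;
		}
		fputs(cmds->name, out);
		col += len;
		count++;
		/* Deliberately no recursion into cmds->subcmds. */
	}
	if (col > 0)
		fputc('\n', out);
	return count;
}
```

Because sub-commands are never visited, a completion script that splits the
output on whitespace sees `pcc` but not `add` or `del`.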

Lustre-change: https://review.whamcloud.com/49484
Lustre-commit: b4cc570ad11c1c07a6e1d825787ccc62c1245ca1

Fixes: bc69a8d058 ("LU-8621 utils: cmd help to stdout or short cmd error")
Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Ib6e139402b9cd18e5a54b8fd3d6a2652d301e736
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
18 months ago EX-7784 tests: reenable arm testing
Patrick Farrell [Wed, 22 Nov 2023 20:57:52 +0000 (15:57 -0500)]
EX-7784 tests: reenable arm testing

Previously, test 460a failed every time on ARM systems with
an issue with lnet/lnb transfers.

After a significant rework of the client compression code
for EX-7601, this no longer happens.

Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Test-Parameters: trivial testlist=sanity clientdistro=el8.8 clientarch=aarch64
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0490a2e7cbadb1492b58eb27c6bf8001b0704b5b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53201
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago LU-17278 ldlm: don't grant failed lock
Alex Zhuravlev [Thu, 9 Nov 2023 13:29:03 +0000 (16:29 +0300)]
LU-17278 ldlm: don't grant failed lock

Lock convert can re-grant a lock if it loses some bits. This
procedure can race with the import's invalidation, so the
lock can become invalid (l_granted_mode=LCK_MINMODE):
LustreError: 8637:0:(ldlm_lock.c:1095:ldlm_grant_lock_with_skiplist())
ASSERTION( ldlm_is_granted(lock) )

Lustre-change: https://review.whamcloud.com/53051
Lustre-commit: f3b45a05475d8c65f06c81f41176b5a7f7d1acaa

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7bb20d62948224647d7632f2822fba44d39a7713
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53286
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-17325 o2iblnd: CM_EVENT_UNREACHABLE on established conn
Serguei Smirnov [Thu, 30 Nov 2023 18:55:11 +0000 (10:55 -0800)]
LU-17325 o2iblnd: CM_EVENT_UNREACHABLE on established conn

There were examples in the field with RoCE setups which demonstrate
that CM_EVENT_UNREACHABLE may be received when the connection is already
in ESTABLISHED state. This causes an assert in kiblnd_cm_callback to
fail.

Handle this in a more graceful manner: report the event as unexpected
and allow the flow to continue. If there are indeed issues on
the connection, it is expected to report transaction errors later
and get cleaned up without crashing the whole system.
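
The idea of reporting the event instead of asserting can be sketched as
follows. All names here (`conn_state`, `cm_action`, `on_unreachable()`) are
hypothetical and only model the decision; they are not the o2iblnd code.

```c
#include <stdio.h>

/* Hypothetical connection states and handler outcomes for this sketch. */
enum conn_state { CONN_CONNECTING, CONN_ESTABLISHED };
enum cm_action { CM_ACT_FAIL_CONNECT, CM_ACT_IGNORE };

/*
 * Decide what to do with an "unreachable" event.  On an established
 * connection the event is logged as unexpected and otherwise ignored,
 * rather than tripping an assertion; later transaction errors are
 * expected to clean the connection up if it is really broken.
 */
enum cm_action on_unreachable(enum conn_state state, FILE *log)
{
	if (state == CONN_ESTABLISHED) {
		fprintf(log, "unexpected UNREACHABLE event on established connection\n");
		return CM_ACT_IGNORE;
	}
	/* Still connecting: treat the event as a failed connection attempt. */
	return CM_ACT_FAIL_CONNECT;
}
```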

Lustre-change: https://review.whamcloud.com/53298
Lustre-commit: TBD (from cbde71bf893dba0de752a190c3b16d653ef75085)

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: If32166fe9fc59e025609c2035cb1c03d3bed22f2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago LU-14928 mgs: allow md target re-register
Alexander Zarochentsev [Sun, 30 May 2021 13:43:05 +0000 (16:43 +0300)]
LU-14928 mgs: allow md target re-register

In a DNE system, it is not safe to do writeconf of
an MD target and attempt to mount (and re-register) it again,
as it creates weird MDT-MDT osp devices like
"fsname-MDT0001-osp-MDT0001" and makes the system non-functional.
The fix doesn't allow creation of such illegal devices.

Lustre-change: https://review.whamcloud.com/44594
Lustre-commit: e4f3f47f04c762770bc36c1e3fa7e92e94a36704

HPE-bug-id: LUS-10098
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I698ee6d70ac96f54eaec57b5c5fe553d130ba011
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago LU-15112 mgc: do not ignore target registration failure
Alexander Zarochentsev [Wed, 15 Dec 2021 10:26:02 +0000 (13:26 +0300)]
LU-15112 mgc: do not ignore target registration failure

A serious target registration failure with the LDD_F_ERROR
flag set is ignored by the target, which makes it possible to
register a new target with an already used index.
The writeconf flag should be encoded in the fs label regardless
of the "first_time" flag, otherwise the target cannot be registered
after an initial registration failure.

Lustre-change: https://review.whamcloud.com/45259
Lustre-commit: cefabee52586f443bfd5163f6ac0b5e1b56a9db7

HPE-bug-id: LUS-8752
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: If051199d3dbafc8f8102f3daf086de01bc5c5f98
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53340
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago LU-15112 ptlrpc: make rq_replied flag always correct
Alexander Zarochentsev [Wed, 15 Dec 2021 12:31:47 +0000 (15:31 +0300)]
LU-15112 ptlrpc: make rq_replied flag always correct

The rq_replied flag is cleared in ptl_rpc_send() only,
so the state of the flag may be incorrect for RPCs which
timed out but were never sent.

Lustre-change: https://review.whamcloud.com/45871
Lustre-commit: 94f3f1b511609fa190cee64c7e8244f21ef70792

HPE-bug-id: LUS-8752
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I0de996a4d775b8f1a1a6b27ff38d21645694f868
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53329
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-8498 osd: const create in osd_ldiskfs_map_inode_pages()
Alex Zhuravlev [Mon, 30 Oct 2023 08:08:57 +0000 (11:08 +0300)]
EX-8498 osd: const create in osd_ldiskfs_map_inode_pages()

The create flag is used to skip reads of unwritten blocks, so don't
use/modify it to enable dense writes.

Fixes: f36eda6a1e ("LU-10026 osd-ldiskfs: use preallocation for dense writes")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I63a08ae2b8ed30d8a8ef4c5570f05d300a2b430b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52887
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago LU-17334 lov: handle object created on newly added OST
Andreas Dilger [Wed, 6 Dec 2023 18:32:57 +0000 (10:32 -0800)]
LU-17334 lov: handle object created on newly added OST

When a new OST is added to a filesystem without no_create,
a new object may be created on the OST relatively quickly
after it is added to the filesystem, in particular because
the new OST would be preferred by QOS space balancing
due to lots of free space. However, it might take a few
seconds for the addition of the new OST to be propagated
across all of the clients, so there is a risk that the MDS
creates file objects on OSTs that a client is not yet aware of,
which returns an error to the application immediately.

This patch fixes the issue by adding a loop in lsme_unpack()
that waits and retries for some number of seconds for
the filesystem layout to be updated if either the
"loi->loi_ost_idx >= lov->desc.ld_tgt_count" or "!ltd"
condition is hit.
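
A minimal userspace sketch of such a wait-and-retry loop is shown below.
The `tgt_table` type, the `wait_fn` callback and `wait_for_ost()` are
assumptions made for illustration; the real lsme_unpack() code and its
sleeping primitives are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the client's view of the OST target table. */
struct tgt_table {
	unsigned int tgt_count;		/* plays the role of ld_tgt_count */
	const bool *tgt_present;	/* false models the "!ltd" case */
};

/* Caller-supplied wait, e.g. sleep one second while config updates land. */
typedef void (*wait_fn)(struct tgt_table *t);

/* Example callback: pretend one more OST became visible during the wait. */
void fake_config_update(struct tgt_table *t)
{
	t->tgt_count++;
}

/*
 * Return 0 once OST `ost_idx` is known to the client, waiting and
 * rechecking up to `max_retries` times; return -1 if it never shows up.
 */
int wait_for_ost(struct tgt_table *t, unsigned int ost_idx,
		 unsigned int max_retries, wait_fn wait)
{
	unsigned int tries;

	for (tries = 0; ; tries++) {
		if (ost_idx < t->tgt_count && t->tgt_present[ost_idx])
			return 0;
		if (tries >= max_retries)
			return -1;
		wait(t);
	}
}
```

Bounding the number of retries keeps a genuinely missing OST from blocking
the caller forever; the error is only delayed, not hidden.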

Lustre-change: https://review.whamcloud.com/53335
Lustre-commit: TBD (from e1de624373ce6082253ddbdd987d36eb56ca6490)

Change-Id: Idc29b8c66079afaea25428577daf51370fa2b084
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53353
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-17337 osd: ask for more revoke credits
Alex Zhuravlev [Tue, 5 Dec 2023 05:20:58 +0000 (08:20 +0300)]
LU-17337 osd: ask for more revoke credits

Starting from 4.* kernels, JBD2 tracks the number of potential
revoked blocks separately from regular journal blocks and
checks that a transaction doesn't exceed the declared number.
Before the extent merging patch, a regular block allocation could
free only a very limited number of blocks. Now, with extent
merging, when an extent tree is really big and a few extents
are inserted in a single transaction, such an allocation
can exceed the default revoke credits (8).
The patch uses the number of extents in the transaction to calculate
the potential number of revoke records (max tree depth * default).

Fixes: 0f7e6c02a9 ("LU-16843 ldiskfs: merge extent blocks")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4967deb56e5aba82b68ffdc91de589fffae6a64a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53325
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago RM-620 build: New tag 2.14.0-ddn119
Andreas Dilger [Thu, 30 Nov 2023 17:19:08 +0000 (10:19 -0700)]
RM-620 build: New tag 2.14.0-ddn119

New tag 2.14.0-ddn119

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I16137e4ed48ff6a28d9a33b9206ad6c5acab3c34

18 months ago EX-7601 ofd: add remote_pages
Patrick Farrell [Sat, 28 Oct 2023 20:34:25 +0000 (16:34 -0400)]
EX-7601 ofd: add remote_pages

When we round a read to get all of the compressed chunks,
the number of local and the number of remote pages will
differ.  We need to make sure we do the checksum and data
transfer using the number of remote pages, not the number of
local pages.

This patch calculates the number of remote pages and uses it
accordingly.  This doesn't do anything yet, but when we
round the local read to include the whole compressed chunk,
this will be needed.
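
To make the difference between the two page counts concrete, here is a
small sketch. The 4096-byte page size and the helper name are assumptions
for this example only.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size for the example */

/* Number of pages touched by the byte range [offset, offset + len). */
uint64_t pages_in_range(uint64_t offset, uint64_t len)
{
	uint64_t first, last;

	if (len == 0)
		return 0;
	first = offset / SKETCH_PAGE_SIZE;
	last = (offset + len - 1) / SKETCH_PAGE_SIZE;
	return last - first + 1;
}
```

For instance, a client read of 8 KiB at offset 4 KiB touches 2 pages (the
remote count), while the same read rounded out to one assumed 64 KiB chunk,
[0, 64 KiB), touches 16 pages (the local count).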

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4875b02016570d227b3b926efd117f0a7cda41b4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52878
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 ofd: add chunk_size to preprw_read
Patrick Farrell [Fri, 27 Oct 2023 19:29:24 +0000 (15:29 -0400)]
EX-7601 ofd: add chunk_size to preprw_read

preprw_read needs chunk size for rounding.  Add this in a
separate patch to keep things trivial; it will be used in
a subsequent patch.

Also use this to add a check in DOM to ensure it doesn't
attempt to do compression.  This should already be
prevented by setstripe, so this is just an extra safety
check.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9dc4d1559e5c8be315268a593466571b54c90a96
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52866
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 ofd: convert dt_bufs_get to offset, len
Patrick Farrell [Fri, 27 Oct 2023 19:22:12 +0000 (15:22 -0400)]
EX-7601 ofd: convert dt_bufs_get to offset, len

dt_bufs_get takes a remote niobuf, but just uses the
offset and length for getting pages.

Compression requires rounding the local IO to include the
full compression chunk, which means the local IO does not
match the remote niobuf any more.

So we modify dt_bufs_get to take an offset and length
rather than a remote niobuf, so we can ask for the pages we
need.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4beaf8207fa00d802c0a339df3de2a3c71154fc7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 ofd: round read lock to chunk
Patrick Farrell [Fri, 27 Oct 2023 18:42:41 +0000 (14:42 -0400)]
EX-7601 ofd: round read lock to chunk

For unaligned reads, we need to round the read locking to
cover any leading or trailing chunks.  We do this by
creating a local 'remote niobuf' to describe the rounded
range and doing the locking against that niobuf.
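
The rounding itself can be sketched like this, assuming the chunk size is a
power of two. `struct byte_range` is a hypothetical stand-in for the fields
of a remote niobuf, not the real structure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte-range descriptor standing in for a remote niobuf. */
struct byte_range {
	uint64_t offset;
	uint64_t len;
};

/* Expand `r` so that it starts and ends on chunk boundaries. */
struct byte_range round_to_chunk(struct byte_range r, uint64_t chunk_size)
{
	struct byte_range out;
	uint64_t end;

	/* chunk_size is assumed to be a nonzero power of two. */
	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);

	/* Round the start down and the end up to a chunk boundary. */
	out.offset = r.offset & ~(chunk_size - 1);
	end = (r.offset + r.len + chunk_size - 1) & ~(chunk_size - 1);
	out.len = end - out.offset;
	return out;
}
```

A range that is already chunk aligned comes back unchanged, so the same
helper can be applied to every read without a separate alignment check.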

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8818522c188aca3c5c5eb564da2a8ba8aef18a4b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 ofd: identify reads to round
Patrick Farrell [Fri, 27 Oct 2023 18:37:11 +0000 (14:37 -0400)]
EX-7601 ofd: identify reads to round

If the beginning or end of a client read is unaligned, we
must round the locking.  This patch identifies reads where
this is required; the next patch will do the locking.

Print a debug message when such an IO is found, but don't
do anything different - yet.
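
A minimal sketch of the alignment test, assuming a power-of-two chunk size;
`read_needs_rounding()` is a hypothetical helper name, not a function from
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when the read [offset, offset + len) does not both start and end
 * on a chunk boundary, i.e. when its locking would need to be rounded.
 */
bool read_needs_rounding(uint64_t offset, uint64_t len, uint64_t chunk_size)
{
	/* chunk_size is assumed to be a nonzero power of two. */
	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);

	/* Any low bits set in either the start or the end mean unaligned. */
	return ((offset | (offset + len)) & (chunk_size - 1)) != 0;
}
```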

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibdab35b733225b4b1349ef457f66ca37dcb2d9bf
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago EX-7601 osc: handle partial chunks in decompress_request
Patrick Farrell [Mon, 27 Nov 2023 21:07:49 +0000 (16:07 -0500)]
EX-7601 osc: handle partial chunks in decompress_request

Now that we have compression for incomplete chunks at the
end of files, decompress_request needs to handle these
chunks.  This patch modifies it to understand compressed
chunks which are less than chunk_size pages.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I877550fa0d418def406e0308392a5336ec9f3ab6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53160
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago EX-7601 osc: rewrite compress_request
Patrick Farrell [Tue, 28 Nov 2023 03:35:49 +0000 (22:35 -0500)]
EX-7601 osc: rewrite compress_request

The existing version of compress_request can't handle
discontiguous RPCs.  Rewrite the logic to handle this
case properly.

This also implements kms handling.

If a write chunk ends at the known minimum size, we know
this write is after all other data in the file and so
there is no compressed data under it.  This means we can
compress this chunk.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8a912d9e279d04c8ff07de39e63a1ec9b490d921
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53111
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago LU-16804 tests: load CONFIG at beginning of init_test_env
Sebastien Buisson [Wed, 10 May 2023 12:13:54 +0000 (14:13 +0200)]
LU-16804 tests: load CONFIG at beginning of init_test_env

In order to have all environment variables properly loaded, make
CONFIG loaded at the beginning of init_test_env().

Lustre-change: https://review.whamcloud.com/50914
Lustre-commit: fdbb2bc8495064e1d9e61f02bcfd13b1e6aec8da

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1c3caa3d582c4b317ff3d0d10fc0103e046ddf17
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53250
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16784 tests: fix path to lgss_sk
Sebastien Buisson [Mon, 1 May 2023 23:44:18 +0000 (16:44 -0700)]
LU-16784 tests: fix path to lgss_sk

Find correct path to lgss_sk utility, by looking inside Lustre build
tree if command is not installed on the local node.

Lustre-change: https://review.whamcloud.com/50825
Lustre-commit: 1ba12d98d5b068083fbb855b287d0b6da0ada80d

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 testlist=sanity-sec env=SHARED_KEY=true
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I23920bb2a44d2ec7e9662e75c23bd5302d8dfee2
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53251
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-17230 socklnd: treat UNKNOWN netif operstate as UP
Serguei Smirnov [Thu, 26 Oct 2023 18:15:28 +0000 (11:15 -0700)]
LU-17230 socklnd: treat UNKNOWN netif operstate as UP

"UNKNOWN" (IF_OPER_UNKNOWN) operational state doesn't necessarily
mean that the interface can't be used and may be the result of
particular network driver not providing UP/DOWN states,
so it may be incorrect for socklnd to initiate
setting of a "fatal error" flag on a NI using an interface
in "UNKNOWN" operstate.
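
The resulting policy can be sketched as below. The enum mirrors the order
of the Linux IF_OPER_* operational states but is redefined locally with
hypothetical SKETCH_ names, and `netif_usable()` is an illustrative helper
rather than the socklnd function.

```c
#include <stdbool.h>

/* Local mirror of the interface operational states, for illustration. */
enum sketch_operstate {
	SKETCH_OPER_UNKNOWN = 0,
	SKETCH_OPER_NOTPRESENT,
	SKETCH_OPER_DOWN,
	SKETCH_OPER_LOWERLAYERDOWN,
	SKETCH_OPER_TESTING,
	SKETCH_OPER_DORMANT,
	SKETCH_OPER_UP,
};

/*
 * Treat UNKNOWN like UP: some drivers never report UP/DOWN, so UNKNOWN
 * alone should not cause a "fatal error" flag to be set on the NI.
 */
bool netif_usable(enum sketch_operstate state)
{
	return state == SKETCH_OPER_UP || state == SKETCH_OPER_UNKNOWN;
}
```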

Lustre-change: https://review.whamcloud.com/52842
Lustre-commit: 6897dbe67c0d7d7554926128a17c65afa1ec0001

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I39dfa01f3758809440d50cf8b6b11555889ef366
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
18 months ago RM-620 build: New tag 2.14.0-ddn118
Andreas Dilger [Mon, 27 Nov 2023 18:49:46 +0000 (11:49 -0700)]
RM-620 build: New tag 2.14.0-ddn118

New tag 2.14.0-ddn118

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40b73ac045de1c00d39691913f81a9e4dccdb72b

18 months ago RM-620 build: New tag lipe-2.37
Andreas Dilger [Mon, 27 Nov 2023 18:47:50 +0000 (11:47 -0700)]
RM-620 build: New tag lipe-2.37

New tag lipe-2.37

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6a55ba0178cc2e2f2bd7566ce3de5f7b231692c5

18 months ago EX-7601 osc: calculate compressed size reduction accurately
Patrick Farrell [Mon, 20 Nov 2023 02:50:21 +0000 (21:50 -0500)]
EX-7601 osc: calculate compressed size reduction accurately

Compression reduces space used if it results in allocating
at least one fewer block on disk.  Modify the checks in
compress_chunk to reflect this, rather than using the
simpler "reduce size by at least 4K" calculation.

Also do not attempt to compress chunks if they are less
than 4K in size, since they can't possibly get a space
benefit.
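
The block-based check can be sketched as follows. The 4096-byte block size
and the function names are assumptions for illustration; this is not the
actual compress_chunk code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096U	/* assumed on-disk block size */

/* Blocks needed to store `bytes` bytes, rounding up. */
uint32_t blocks_for(uint32_t bytes)
{
	return (bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

/* Compression is worthwhile only if it frees at least one whole block. */
bool compression_saves_space(uint32_t orig_len, uint32_t compressed_len)
{
	/* A chunk smaller than one block cannot drop below one block. */
	if (orig_len < SKETCH_BLOCK_SIZE)
		return false;
	return blocks_for(compressed_len) < blocks_for(orig_len);
}
```

Under these assumptions, compressing 5000 bytes to 4000 saves a block (2
blocks become 1) even though the size shrinks by less than 4K, while
compressing 65536 bytes to 61500 saves nothing because both need 16 blocks.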

This improved my measured ratio on a version of the Linux
kernel source data set from 1.24 to 1.56, so this is
significant for datasets with many small files.  (This
version of the source had large incompressible files
removed, to focus on smaller files.  The unmodified data set
would not improve as much.)

Note this is still short of our estimates, so either the
estimate or Lustre still needs adjustment.  TBD.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I815706914b88de4f532a674d773769aa3a64d218
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 ofd: series description and reorder declarations
Patrick Farrell [Tue, 31 Oct 2023 15:11:01 +0000 (11:11 -0400)]
EX-7601 ofd: series description and reorder declarations

Reorder declarations in tgt_brw_read prior to adding things.

This trivial patch is a good place to put the description
of this series, which handles unaligned reads to compressed
files.

-----------------------
These patches handle compression chunk unaligned reads on
the server.

When using compression, the client attempts to send chunk
aligned reads, but sometimes it can't, and the client will
send a read to the server which is not chunk aligned.

In this case, the server must read the full chunk,
decompress it, and provide the requested data to the client.

Here's how we do this.

The server receives a set of remote niobufs describing IO
from the client.  Each remote niobuf (rnb) describes a range
of data the client wants to do IO to.

These are translated to a set of local niobufs on the
server, which we then use to do the read.  For compression,
the server has to read complete chunks on unaligned reads.

So we walk these remote niobufs and identify unaligned read
requests (in ofd_preprw_read), then round them to chunk
size. The server then reads the chunk rounded read request
from storage.

The local niobufs now contain a set of complete compressed
chunks, i.e., the raw data from disk.  We need to decompress
the chunks where the client is doing an unaligned read, but
leave the other chunks compressed (because the client will
uncompress them).

So, in obd_decompress_read, we use the remote niobuf to
identify unaligned reads from the client.  We then walk the
local niobufs, identify the chunks which match the unaligned
reads from the client, and decompress them 'in place'.
The decompression uses temporary buffers, but the
decompressed data is placed back in the local niobuf.
(If the data is uncompressed on disk, we of course do not
decompress it.  This happens for incompressible data.)

Now the local niobuf contains some raw chunks and some
chunks which have been decompressed.  This is *more* data
than the client asked for.  Normally, the server local
niobuf contains exactly what the client asked for, so the
server checksums and sends the entire local niobuf.  But
because we read complete chunks, the local niobuf contains
more data than the client requested.

This means we need to identify the subset of the local
niobuf which the client actually wants to read and present
that to the client.

In order to do that, we walk the local niobuf and use the
remote niobufs (the description of the pages the client
needs) and create a special tx niobuf which points to only
the pages the client wants (io_lnb_to_tx_lnb).  Then we use
this tx niobuf for checksum and transfer to the client.
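
The final selection step can be sketched as below. The `rnb_sketch` and
`lnb_sketch` types and `select_tx_pages()` are simplified, hypothetical
stand-ins; in particular, this sketch copies page descriptors into the tx
array, whereas the description above has the tx niobuf point at the pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for remote and local niobufs. */
struct rnb_sketch {
	uint64_t offset;	/* start of the range the client asked for */
	uint64_t len;
};
struct lnb_sketch {
	uint64_t file_offset;	/* file offset of one locally read page */
	uint32_t len;
};

/*
 * Copy into `tx` the local pages that overlap any client-requested
 * range.  Returns the number of entries written (at most `tx_max`).
 */
size_t select_tx_pages(const struct lnb_sketch *lnb, size_t nr_lnb,
		       const struct rnb_sketch *rnb, size_t nr_rnb,
		       struct lnb_sketch *tx, size_t tx_max)
{
	size_t i, j, n = 0;

	for (i = 0; i < nr_lnb && n < tx_max; i++) {
		uint64_t pg_start = lnb[i].file_offset;
		uint64_t pg_end = pg_start + lnb[i].len;

		for (j = 0; j < nr_rnb; j++) {
			uint64_t rq_start = rnb[j].offset;
			uint64_t rq_end = rq_start + rnb[j].len;

			/* Half-open ranges overlap if each starts before the other ends. */
			if (pg_start < rq_end && rq_start < pg_end) {
				tx[n++] = lnb[i];
				break;	/* take each page at most once */
			}
		}
	}
	return n;
}
```

With 16 local pages covering one 64 KiB chunk and a single client request
for 8 KiB at offset 4 KiB, only the two pages at offsets 4096 and 8192 are
selected for checksum and transfer.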

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic89dcef7e169879725caa6cdef4619b9a76b2b37
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago EX-7601 ofd: rename map_remote_to_local
Patrick Farrell [Fri, 27 Oct 2023 19:01:06 +0000 (15:01 -0400)]
EX-7601 ofd: rename map_remote_to_local

osd_map_remote_to_local implies some complex role, but in
fact what this does is initialize the fields of the
local niobuf structs to represent the requested range.

This *may* be the same as a remote niobuf, but in some
cases it isn't.  Name it osd_init_lnbs instead.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0d3b5f24e42ee8dc962437daea7cf9347ccb9059
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52861
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago EX-7601 ofd: rename 'local' in thread_big_cache
Patrick Farrell [Sat, 28 Oct 2023 17:48:04 +0000 (13:48 -0400)]
EX-7601 ofd: rename 'local' in thread_big_cache

It's not a big deal since it's only used a few times, but
let's give this variable a descriptive name.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ide136cd42e885d59f1a2e4ce22a2e7449faca3f9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52874
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago EX-7601 ofd: do not overwrite rc in unmerge_chunk
Patrick Farrell [Sat, 28 Oct 2023 20:30:56 +0000 (16:30 -0400)]
EX-7601 ofd: do not overwrite rc in unmerge_chunk

unmerge_chunk should not be responsible for setting the
lnb rc, because this overwrites the result of any previous
activity on the lnb.  Plus, unmerge_chunk can't fail.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id1ce590c7f1da3ab7faddbd685d264a33c08d639
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52876
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months ago EX-7601 osc: allow multiple chunks in read
Patrick Farrell [Thu, 16 Nov 2023 23:26:26 +0000 (18:26 -0500)]
EX-7601 osc: allow multiple chunks in read

It's rare, but reads can sometimes have multiple
discontiguous chunks.  Update decompress_request to
handle this case.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I880af95db285dce76db3610e8140a0f54baa401b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53159
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 osc: rename pages_in_chunk
Patrick Farrell [Thu, 16 Nov 2023 20:37:57 +0000 (15:37 -0500)]
EX-7601 osc: rename pages_in_chunk

Chunks can have variable numbers of pages in them.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If199d777367569e62c21305f6e4b9f3e4cce6d06
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-7601 obd: move type switching to alloc_compr callers
Patrick Farrell [Sat, 11 Nov 2023 20:39:22 +0000 (15:39 -0500)]
EX-7601 obd: move type switching to alloc_compr callers

The code is much cleaner if we can eliminate applied type
and handle that issue once per compression or decompression
rather than for every chunk.  This requires moving the type
switching inside alloc_compr.  (Also improve some error
messages - alloc_compr can fail with ENOMEM as well.)

The compression code currently allocates a transform for
every chunk on the client.  This is relatively cheap, but
it also complicates the code by repeatedly checking if a
particular compression type is supported (this is the
"applied type" code).

Moving alloc_compr to compress/decompress request makes the
code much simpler.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I162e81577db721a9715d57b3f262fcabbcbf308a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53103
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-8590 lipe: Use only one client for the test
Alexandre Ioffe [Sun, 26 Nov 2023 19:42:14 +0000 (11:42 -0800)]
EX-8590 lipe: Use only one client for the test

Use only one client machine for hot-pools tests 75a, b, c.

Test-Parameters: trivial testlist=hot-pools
Test-Parameters: trivial testlist=hot-pools env=ONLY=75a
Test-Parameters: trivial testlist=hot-pools env=ONLY=75b
Test-Parameters: trivial testlist=hot-pools env=ONLY=75c
Test-Parameters: trivial testlist=hot-pools env=ONLY=75a
Test-Parameters: trivial testlist=hot-pools env=ONLY=75b
Test-Parameters: trivial testlist=hot-pools env=ONLY=75c
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Icfa958474ec928faeec63029a2d5983cea650bb7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53240
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months ago EX-8466 tests: limit 'cmp' output in sanity-pcc.sh
Andreas Dilger [Sat, 25 Nov 2023 05:27:36 +0000 (22:27 -0700)]
EX-8466 tests: limit 'cmp' output in sanity-pcc.sh

Limit the number of lines printed by 'cmp' when there is an error
comparing two files.  Often the files are multiple MB in size, and
printing 1-32M lines of output when the test fails is not useful.

Instead, print the first 66000 lines of output by default, which is
enough to see a full 64KiB plus some lines to see if more than 64KiB
of data is incorrect.  This is controlled by the CMP_LINES variable.
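
The effect of the limit can be sketched as follows. This is an illustrative Python model rather than the shell code in sanity-pcc.sh; `diff_bytes` and `limited_cmp` are hypothetical helpers that mimic the one-line-per-differing-byte output of 'cmp -l' and stop after CMP_LINES lines.

```python
from itertools import islice

CMP_LINES = 66000  # default named in the commit message; 64 KiB is 65536 bytes


def diff_bytes(a: bytes, b: bytes):
    """Yield one (offset, byte_a, byte_b) tuple per differing byte (1-based offset)."""
    for offset, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            yield offset, x, y


def limited_cmp(a: bytes, b: bytes, limit: int = CMP_LINES):
    """Return at most 'limit' difference lines (hypothetical helper)."""
    return list(islice(diff_bytes(a, b), limit))
```

With two 1 MiB files that differ in every byte, `limited_cmp` returns 66000 entries instead of 1048576, which still covers a full 64 KiB of bad data plus a few hundred lines beyond it.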

Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I80f4d5d3460d531ab63788185a2c88e79415a801
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
18 months agoLU-17312 tests: skip conf-sanity test_53 in interop
Andreas Dilger [Fri, 24 Nov 2023 07:48:44 +0000 (00:48 -0700)]
LU-17312 tests: skip conf-sanity test_53 in interop

Skip conf-sanity test_53 in interop because older servers cannot
stop any running service threads above threads_max.

Remove old test interop for servers < 2.3.

Lustre-change: https://review.whamcloud.com/53226
Lustre-commit: TBD (from d029a1cb45ac440e580c177866f0e9766444d8f1)

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: testlist=conf-sanity env=ONLY=53 serverversion=EXA5
Fixes: 183cb1e3cd ("LU-947 ptlrpc: allow stopping threads above threads_max")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia95405060c607c7a070720ed32a7a43b1c3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53227
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
18 months agoEX-7600 osd: save compressed object size on zfs
Artem Blagodarenko [Thu, 2 Nov 2023 22:20:52 +0000 (22:20 +0000)]
EX-7600 osd: save compressed object size on zfs

"osc: save compressed object size" added means to transfer
object size to the osd and added ldiskfs support.

This patch adds saving the object size to the ZFS backend.
Currently this fix is submitted as a separate patch for
testing purposes, but it can be merged into the main patch later.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Test-Parameters: trivial testlist=sanity fstype=zfs
Change-Id: I99e29e3f756a070b5f3cece12c4ca58f668a2ecf
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8671 tests: use smaller files in sanity-pcc/103+104
Andreas Dilger [Thu, 23 Nov 2023 01:29:08 +0000 (18:29 -0700)]
EX-8671 tests: use smaller files in sanity-pcc/103+104

Running fallocate is fast, but the actual PCC data copy may be slow.
Use smaller test files for sanity-pcc test_103 and test_104 to speed
up testing, and also wait longer in case the copy is slow.

Add some extra debugging on failure so we can see the file attach
state on failure, in case there is something wrong with the parsing.

Test-Parameters: trivial testlist=sanity-pcc
Test-Parameters: testlist=sanity-pcc env=ONLY=103,ONLY_REPEAT=100
Test-Parameters: testlist=sanity-pcc env=ONLY=104,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I09f159810a778b8ef2bab93d0e2869237a3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
18 months agoEX-7601 osd: osd_bufs_put does not always handle all pages
Patrick Farrell [Wed, 18 Oct 2023 17:40:51 +0000 (13:40 -0400)]
EX-7601 osd: osd_bufs_put does not always handle all pages

osd_bufs_put asserts that the dio pages used after are
always zero, but there's no reason for this to be true and
compression specifically violates this by using 1 page at
a time.

Without this patch, we hit this assert and crash when
nonrotational = 1.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If6bdb11f254c260e2da4cabe11a82693a468e6fb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52750
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
18 months agoLU-17251 osp: start OST object precreate earlier
Andreas Dilger [Sun, 26 Nov 2023 05:58:09 +0000 (22:58 -0700)]
LU-17251 osp: start OST object precreate earlier

If the OST object precreate count gets large (usually due to high
MDT file create workload, but sometimes also forced during testing)
then send an OST_CREATE RPC sooner when the number of precreated
objects gets low.

Currently the MDS will wait until 1/2 of the precreated OST objects
are consumed, but if create_count = 10000, then this can put bursty
create workloads on the OST.  Instead, send an OST_CREATE RPC when
the precreate pool is at most 1024 objects below target, so that the
MDS keeps its precreated pool more full and the OST doesn't have to
create so many objects at once (which also locks object directories
for a longer time).
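
A minimal sketch of the two trigger rules, assuming the pool is described by the number of available precreated objects and the create_count target. The function names are hypothetical, and keeping the half-consumed rule for small create_count values is an assumption, not something the commit message states.

```python
PRECREATE_MAX_GAP = 1024  # "at most 1024 objects below target"


def should_precreate_old(available: int, create_count: int) -> bool:
    """Old rule: wait until half of the precreated objects are consumed."""
    return available <= create_count // 2


def should_precreate_new(available: int, create_count: int) -> bool:
    """New rule: also refill once the pool is 1024 objects below target."""
    # Assumption: for small create_count the half-consumed rule still applies.
    low_water = max(create_count // 2, create_count - PRECREATE_MAX_GAP)
    return available <= low_water
```

With create_count = 10000, the old rule waits until only 5000 objects remain, while the new rule already sends the RPC at 8976, so each refill creates far fewer objects at once.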

Don't set opd_force_creation=true when osp.*.create_count is set
larger, and instead rely on the improved precreate check to force
OST object creation to start sooner, as opd_force_creation=true
can cause the OSP precreation to stop completely in some cases.

Lustre-change: https://review.whamcloud.com/53245
Lustre-commit: TBD (from 6ffb849d7086a2b2ae48f274d4f5b1b8fbf83fe2)

Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Fixes: df5b4c0a8b ("LU-17251 osp: force precreate if create_count grows")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id2d12636d535485919ca5eec3adb18b1e6ce7057
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53244
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoRM-620 build: New tag 2.14.0-ddn117
Andreas Dilger [Fri, 24 Nov 2023 09:35:29 +0000 (02:35 -0700)]
RM-620 build: New tag 2.14.0-ddn117

New tag 2.14.0-ddn117

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ida37fd66ddfd7331efbc3a2276ddaf0f574f5de5

18 months agoLU-16468 llite: protect layout before read IO going
Bobi Jam [Fri, 13 Jan 2023 04:36:01 +0000 (12:36 +0800)]
LU-16468 llite: protect layout before read IO going

It's possible that before the read IO, file_read_confine_iter()
->lov_attr_get() is called to get the proper kms (known minimum size
of the file). lov_attr_get() presumes that it is called under ongoing
IO, which protects the layout from changing, but that is not the case
here.

Lustre-change: https://review.whamcloud.com/49622
Lustre-commit: from e050b91c6c471d3576eba3bbf4f3c31aef644e3f

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I1b36ec6e158331e63e8026ee2b986d5a7e3cb6dc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49623
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8386 lipe: Remove cruft from systemd services
Nathaniel Clark [Wed, 15 Nov 2023 21:09:10 +0000 (16:09 -0500)]
EX-8386 lipe: Remove cruft from systemd services

Remove After=rust-iml-agent.service

rust-iml-agent is deprecated and no longer needed.

Change-Id: Icd0e79dbd417e98beb07f8546487d20fa5f6bb62
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53152
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8282 lfs: migrate compressed file without stripe info
Bobi Jam [Wed, 8 Nov 2023 10:41:21 +0000 (18:41 +0800)]
EX-8282 lfs: migrate compressed file without stripe info

lfs migrate file without specifying stripe info will get layout info
from the file as the target layout template, and
llapi_layout_get_by_xattr() tries to convert LOV_PATTERN_* values
to user scope LLAPI_LAYOUT_* values, while LOV_PATTERN_COMPRESS
is missed in this conversion.

This patch adds a function llapi_pattern_from_lov() to handle this
conversion specifically.

This patch also adds more error messages for llapi_layout_file_open().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I49a43cc7761cd2baed7a5da7d4e7cff2152ff9bb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: remove &pga usage in compress_request
Patrick Farrell [Tue, 21 Nov 2023 17:34:31 +0000 (12:34 -0500)]
EX-7601 osc: remove &pga usage in compress_request

The usage of 'pga' and '&pga' in compress_request is
confusing, but also, compress_request modifies &pga by
allocating a new compressed page array.  Except if we fail
in compress_request, we free that new page array.

This means failing in compress_request replaces 'pga' with
a pointer to freed memory.  Instead, create an explicit
cpga pointer in the caller and use that.  This allows
compress_request to fail safely.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idaf592103c57b0e9ce76ab520a69b819d4f37be9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53120
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: give compress_request explicit success
Patrick Farrell [Tue, 21 Nov 2023 17:33:06 +0000 (12:33 -0500)]
EX-7601 osc: give compress_request explicit success

Compress_request has explicit failure handling, but the
success handling just follows the failure handling.  This is
confusing - on failure, we do:
page_count = *pcount
then immediately do:
*pcount = page_count

It also sets *orig_pga = pga on success OR failure, which
is wrong because compress_request may have modified pga and
then failed.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I121ec71cfe35babc4a572951e93f7581887ade80
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53119
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: rearrange compress_request
Patrick Farrell [Tue, 21 Nov 2023 17:32:20 +0000 (12:32 -0500)]
EX-7601 osc: rearrange compress_request

A trivial rearrangement of compress_request to make it
more readable before redoing the core logic.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1d34cd2a2a6d84bc30cc7dae8eb07586c4837f7d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53110
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: replace assert with error
Patrick Farrell [Mon, 13 Nov 2023 04:20:01 +0000 (23:20 -0500)]
EX-7601 osc: replace assert with error

We shouldn't assert on values read from storage; if they are
incorrect, we should return EIO instead.
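
The pattern can be sketched as below. This is a hypothetical Python model, not the osc code: `check_chunk_header` and its two fields are invented for illustration, and the only point is that a bad on-disk value produces an EIO error rather than a crash.

```python
import errno


def check_chunk_header(compressed_size: int, chunk_size: int) -> None:
    """Reject an implausible on-disk value instead of asserting on it."""
    # Hypothetical sanity rule: a compressed chunk is non-empty and
    # never larger than the uncompressed chunk size.
    if not 0 < compressed_size <= chunk_size:
        raise OSError(errno.EIO, "corrupt compressed chunk header")
```

A caller can then fail the single read with EIO and keep running, which an assertion on corrupt data would not allow.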

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icda213e3c5a90a848c9b008788e92ee49e2efcb1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53108
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: variable cleanup in decompress_req
Patrick Farrell [Mon, 13 Nov 2023 04:13:10 +0000 (23:13 -0500)]
EX-7601 osc: variable cleanup in decompress_req

Use type and lvl variables in decompress_request.

Remove an unused variable and an assert which can never
fire.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ieff57411a2a41215fd368d731614801bd0f43e38
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53107
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 obd: move module load to function
Patrick Farrell [Sat, 11 Nov 2023 20:21:01 +0000 (15:21 -0500)]
EX-7601 obd: move module load to function

This is a trivial code change to make alloc_compr a bit
shorter.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0a790afe7afebde1d223420d9a578529da6ff7e5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53102
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 ofd: make compress_chunk take chunk_bits
Patrick Farrell [Fri, 10 Nov 2023 22:21:52 +0000 (17:21 -0500)]
EX-7601 ofd: make compress_chunk take chunk_bits

Chunk bits is used everywhere, so have compress_chunk convert
to log bits rather than having the callers do it.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic01bb749425cb95d9c5717965d692a18138ceeb7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53100
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: cleanup compression variables
Patrick Farrell [Fri, 10 Nov 2023 22:19:00 +0000 (17:19 -0500)]
EX-7601 osc: cleanup compression variables

Make usage of the compression variables more readable.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I6daff56b56877c8f36e02303cc0579ba7faa731b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53099
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: rename 'done'
Patrick Farrell [Fri, 10 Nov 2023 22:10:34 +0000 (17:10 -0500)]
EX-7601 osc: rename 'done'

Rename the ambiguous 'done' and remove it where not used.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8fb88b7a91fcc7dbd5ce2d29a61c18330fc0cda3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53098
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: rename pages_in_chunk
Patrick Farrell [Fri, 10 Nov 2023 22:07:39 +0000 (17:07 -0500)]
EX-7601 osc: rename pages_in_chunk

Use the more standard pages_per_chunk.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I47e0995fe8aa8d1a9a610669d6cd4c39559b6fa4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53097
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7600 osc: use pages_left in unmerge_chunk
Patrick Farrell [Sun, 12 Nov 2023 19:52:28 +0000 (14:52 -0500)]
EX-7600 osc: use pages_left in unmerge_chunk

Since we have compressed chunks < chunk_size (if they're
after EOF), we must use pages_left in unmerge_chunk or it
will go off the end of the page array.

This also lets us remove the workaround where unmerge_chunk
would skip pages that were not present.  unmerge_chunk
always works with a known and complete set of pages, so this
check is unneeded.

We should also check that our count of bytes is correct
when we finish.
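
A rough model of the bounded copy, assuming pages are fixed-size buffers and the decompressed data arrives as one contiguous buffer. The signature is hypothetical and PAGE_SIZE is set to 4096 for illustration only.

```python
PAGE_SIZE = 4096  # illustrative page size


def unmerge_chunk(pages, first, buf, pages_per_chunk, pages_left):
    """Copy buf back into pages[first:], bounded by the pages that remain."""
    # A chunk near EOF may cover fewer pages than a full chunk.
    count = min(pages_per_chunk, pages_left)
    copied = 0
    for i in range(count):
        piece = buf[copied:copied + PAGE_SIZE]
        if not piece:
            break
        pages[first + i][:len(piece)] = piece
        copied += len(piece)
    # Verify the byte count when finished, as the commit message suggests.
    if copied != len(buf):
        raise ValueError(f"copied {copied} bytes, expected {len(buf)}")
    return copied
```

With a 16-page chunk but only 3 pages left in the array, the loop stops after 3 pages instead of indexing past the end, and the final check catches a buffer that does not fit in the remaining pages.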

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I88896307990ff839514e54e9a7e18390a457e5d8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53095
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 osc: only set compressed flag on compressed pages
Patrick Farrell [Mon, 13 Nov 2023 16:18:49 +0000 (11:18 -0500)]
EX-7601 osc: only set compressed flag on compressed pages

The code accidentally sets the compressed flag on all
pages processed through fill_cpga, even if they're not
compressed.  Oops.

Also stop setting pg->index on the pages in the compressed
pga, this is only used by encryption and that's no longer
supported with compression.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I313fd943a18b71cd52493852a6884f30d187e52f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53118
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months agoEX-7601 osc: remove cpga fill bits
Patrick Farrell [Fri, 10 Nov 2023 19:07:07 +0000 (14:07 -0500)]
EX-7601 osc: remove cpga fill bits

cpga fill bits are not needed now that we don't support
combining compression and encryption.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I13c2278e085e9b288bd896585947e28e2ea505ca
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53082
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 ofd: add obd level compression lib
Patrick Farrell [Wed, 1 Nov 2023 21:07:23 +0000 (17:07 -0400)]
EX-7601 ofd: add obd level compression lib

Some compression functions will be used by several areas
of Lustre, so they need to be in obdclass.

This moves merge_chunk and unmerge_chunk there and adds the
ability for them to merge lnbs.  This is used in a future
patch.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If4a318119bb7685e41adb9f3b31a66074031e6ac
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 llite: restrict readahead to eof
Patrick Farrell [Tue, 14 Nov 2023 22:58:16 +0000 (17:58 -0500)]
EX-7601 llite: restrict readahead to eof

Compressed file readahead rounding needs to come before
readahead is limited to EOF.
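
The ordering matters because rounding after the EOF limit can push the window past EOF again. A small sketch, using byte offsets and a hypothetical `readahead_end` helper (the real code may work in page indices):

```python
def readahead_end(end: int, chunk_size: int, eof: int) -> int:
    """Round the readahead end up to a chunk boundary, then limit it to EOF."""
    rounded = -(-end // chunk_size) * chunk_size  # ceiling to a chunk boundary
    return min(rounded, eof)
```

For end = 100000, a 64 KiB chunk and EOF at 120000, rounding first gives 131072 and the limit then yields 120000. Limiting first and rounding second would leave the window at 131072, beyond EOF.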

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4e9e7fe63301c08efcb05f170726735593a9431d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53137
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16032 tests: restore delay_unlink_mb in sanity/360
Andreas Dilger [Thu, 23 Nov 2023 22:56:00 +0000 (15:56 -0700)]
LU-16032 tests: restore delay_unlink_mb in sanity/360

Restore the original value of osd-ldiskfs.*.delay_unlink_mb after
sanity test_360 is finished, so that it doesn't have an impact on
later tests running, in particular sanity-quota.sh was seeing some
delay in freeing quota for files that were just deleted.

Lustre-change: https://review.whamcloud.com/53218
Lustre-commit: TBD (from 8fa0580fd64fe7cbe969817ece87a161c517c4c3)

Test-Parameters: trivial testlist=sanity-quota
Fixes: a772e90243 ("LU-16032 osd: move unlink of large objects to separate thread")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7c1ab02262afdef2fc51f9fbc3932d954a4f8304
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53219
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-15777 hsm: set changelog error for restore layout swap failure
Nikitas Angelinas [Wed, 11 May 2022 22:54:08 +0000 (15:54 -0700)]
LU-15777 hsm: set changelog error for restore layout swap failure

Set the error code in the changelog record generated, if the layout swap
fails at the end of an HSM restore operation. Also, handle error code
overflow inside hsm_set_cl_error(), so that callers don't need to do
this themselves.
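
Handling the overflow in one place might look like the sketch below. The 7-bit field width, the shift of 3 and the 0x7f overflow marker are illustrative assumptions, and `set_cl_error` is a Python stand-in, not the actual hsm_set_cl_error() implementation.

```python
ERR_SHIFT = 3                        # assumed position of the error field
ERR_BITS = 7                         # assumed width of the error field
ERR_OVERFLOW = (1 << ERR_BITS) - 1   # 0x7f marks "error did not fit"
ERR_MAX = ERR_OVERFLOW - 1           # largest error code stored as-is


def set_cl_error(flags: int, error: int) -> int:
    """Store an error code in the flags field, clamping values that do not fit."""
    code = abs(error)
    if code > ERR_MAX:
        code = ERR_OVERFLOW
    mask = ERR_OVERFLOW << ERR_SHIFT
    return (flags & ~mask) | (code << ERR_SHIFT)
```

Callers can pass any error value and rely on the helper to clamp it, instead of each call site repeating the range check.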

Lustre-change: https://review.whamcloud.com/47121
Lustre-commit: 09fe64719b888cd212b6cffe923545b7650f230f

Suggested-by: Olaf Weber <olaf.weber@hpe.com>
Suggested-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Change-Id: I4ed2ebffa3bc1c6a0f87ea9f13734e344f77006f
HPE-bug-id: LUS-10863
Test-Parameters: testlist=sanity-hsm,sanity-pcc
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17115 quota: fix race of acquiring locks in qmt
Hongchao Zhang [Thu, 26 Oct 2023 12:46:44 +0000 (20:46 +0800)]
LU-17115 quota: fix race of acquiring locks in qmt

In qmt_delete_qid and qmt_reset_qid, the order of acquiring
the lquota_entry lock and the journal differs from that
in qmt_dqacq0, which could cause a deadlock in some cases.

Lustre-change: https://review.whamcloud.com/52371
Lustre-commit: ee0e9447e7022e2caa8b161657d505e17ccdc4a1

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ic439f2c5d6ca22429422b87f0dde65e0d2e6113d
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53047
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16097 quota: release preacquired quota when over limits
Hongchao Zhang [Thu, 19 Oct 2023 06:33:47 +0000 (14:33 +0800)]
LU-16097 quota: release preacquired quota when over limits

The pre-acquired quota on each MDT or OST should be released when
the whole quota is over limits, for instance, after the quota limits
have been decreased for some quota ID by the administrator.

Lustre-change: https://review.whamcloud.com/48576
Lustre-commit: 57ac32a22372065b789ca491a568f075e755d339

Test-Parameters: testlist=sanity-quota
Test-Parameters: testlist=sanity-quota
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I6263b835d4ae6a3fd03f9a2bc4f463949cbc74d4
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53070
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17142 mgc: reconnection without pinger
Alexander Boyko [Tue, 22 Aug 2023 09:53:14 +0000 (05:53 -0400)]
LU-17142 mgc: reconnection without pinger

When MGS was offline for some time, AT is increased and
connection request deadline is high. Reconnect with a pinger
waits a request deadline for a next attempt. A situation is
worse with a failover partner, when different connections are used.
Reconnection could fail with local MGS too.

Here is the error when MGC could not connect to a local MGS, MDT
combined with MGS.

    LustreError: 15c-8: MGC90@kfi:
    Confguration from log kjlmo12-MDT0000 failed from MGS -5.

The patch forces reconnection with import invalidate and aborts
inflight requests.

ptlrpc_recover_import() aborts waiting for disconnect import state.
But disconnect happens between connection attempt and it is valid.
This is fixed.

Reset Adaptive Timeout when local MGS starts. It allows MGC to
reconnect efficiently.

mgs_barrier_gl_interpret_reply() should handle -EINVAL from a client,
which means the client doesn't have a lock.

Lustre-change: https://review.whamcloud.com/52498
Lustre-commit: 867ba433e3a0fce4a1b2f8d37a91d550ada41a26

HPE-bug-id: LUS-11633
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ie631e04fb3e72900af076cf7f268f20f7b285445
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53116
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoRM-620 build: New tag 2.14.0-ddn116
Andreas Dilger [Wed, 22 Nov 2023 21:11:28 +0000 (14:11 -0700)]
RM-620 build: New tag 2.14.0-ddn116

New tag 2.14.0-ddn116

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf3d0d8a468b44c0bd179bc729fc66483cb45581

18 months agoRM-620 build: New tag 2.14.0-2.14.0-ddn116
Andreas Dilger [Wed, 22 Nov 2023 21:10:48 +0000 (14:10 -0700)]
RM-620 build: New tag 2.14.0-2.14.0-ddn116

New tag 2.14.0-2.14.0-ddn116

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I752cf0dfd78de778fe34787b2e026fec0277f610

18 months agoEX-8236 pcc: abort data copy via ll_fid_path_copy
Qian Yingjin [Fri, 10 Nov 2023 09:23:46 +0000 (04:23 -0500)]
EX-8236 pcc: abort data copy via ll_fid_path_copy

For data copying via ll_fid_path_copy in direct I/O mode in user
space, the client calls llapi_pcc_state_fd() to obtain the file
PCC state. If it is marked with PCC_STATE_FL_ATTACH_ABORTING, the
data copy process ll_fid_path_copy exits immediately.
To reduce the overhead of this check, we do not check on every
data copy iteration; instead, we check once every certain number
of I/Os (32 by default). For an I/O size of 32MiB, this checks
once per second at 1GiB/s. There may be some time lag
before the copy tool finally quits.
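
The periodic check can be sketched as follows. `copy_with_abort_check` and its callbacks are hypothetical stand-ins for the copy loop and the llapi_pcc_state_fd() query; only the 32-I/O default comes from the commit message.

```python
CHECK_INTERVAL = 32  # I/Os between state checks (default per the commit message)


def copy_with_abort_check(chunks, write, is_aborting, interval=CHECK_INTERVAL):
    """Copy chunks, polling is_aborting() once every 'interval' I/Os.

    Returns (number of chunks written, True if the copy completed).
    """
    done = 0
    for chunk in chunks:
        # Poll only on every 'interval'-th I/O to keep the overhead low.
        if done % interval == 0 and is_aborting():
            return done, False
        write(chunk)
        done += 1
    return done, True
```

The arithmetic behind the quoted rate: 32 I/Os of 32MiB each is 1024MiB, or 1GiB, so at 1GiB/s the state is polled about once per second, and an abort is noticed after at most one such interval.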

Change-Id: I20631e5481a7e97d7a1ed0729bcd269ef6248a2c
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53073
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7331 csdc: prohibit set compression upon encrypted file
Bobi Jam [Fri, 10 Nov 2023 09:17:50 +0000 (17:17 +0800)]
EX-7331 csdc: prohibit set compression upon encrypted file

Setting a compression layout component on an encrypted file is not
allowed for now.

This patch adds this check on the MDS when creating a file with a
layout, or adding/merging a new mirror to an existing file.

Test-Parameters: testlist=sanity-sec env=ONLY=67,PTLDEBUG=-1
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I60d9f4bfce3a498f1eb3994c6276afb9d89c99a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53075
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8584 tests: check and wait lpcc_purge scanning ends
Lei Feng [Fri, 17 Nov 2023 07:53:21 +0000 (15:53 +0800)]
EX-8584 tests: check and wait lpcc_purge scanning ends

Check lpcc_purge status to make sure it finishes at least
one round of scanning.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY="200 201 202",ONLY_REPEAT=50
Change-Id: I8e6f50393d1a3cbb7a1bc976942631db6ecceb67
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53167
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-16284 utils: lfs getstripe follows symlink
Lei Feng [Tue, 1 Nov 2022 02:57:39 +0000 (10:57 +0800)]
LU-16284 utils: lfs getstripe follows symlink

'lfs getstripe' prints the information of symlink target by default.
With '--no-follow' option it prints the information of symlink itself.

Lustre-change: https://review.whamcloud.com/49003
Lustre-commit: af32b516593dbf2a8e7a85d885c33fd017926ada

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I6cef01af5bb2235bdcbf0b5c99af4b9ed5869515
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53139
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17275 kernel: RHEL 8.9 client support
Jian Yu [Mon, 20 Nov 2023 22:32:40 +0000 (14:32 -0800)]
LU-17275 kernel: RHEL 8.9 client support

This patch makes changes to support RHEL 8.9 release
with kernel 4.18.0-513.5.1.el8_9 for Lustre client.

Lustre-change: https://review.whamcloud.com/53071
Lustre-commit: TBD (from 0da16c715a06b6426a6b99c111147fc875784e85)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=el8.9 serverdistro=el8.8 testlist=sanity

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-3

Change-Id: Ia3672d134534b877bb6aaffb4cea0339bc55974f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53089
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17293 kernel: update SLES15 SP5 [5.14.21-150500.55.36.1]
Jian Yu [Fri, 17 Nov 2023 18:02:00 +0000 (10:02 -0800)]
LU-17293 kernel: update SLES15 SP5 [5.14.21-150500.55.36.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.36.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/53156
Lustre-commit: TBD (from 3e50280434d250996dfaa9d68d7da5e2c45d59ef)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp5 testlist=sanity

Change-Id: I5a9afb313e9bf315ef4af5b6602785ee68c4c247
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53172
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17274 kernel: new kernel [RHEL 9.3 5.14.0-362.8.1.el9_3]
Jian Yu [Thu, 9 Nov 2023 19:01:19 +0000 (11:01 -0800)]
LU-17274 kernel: new kernel [RHEL 9.3 5.14.0-362.8.1.el9_3]

This patch makes changes to support new RHEL 9.3 release
for Lustre client.

Lustre-change: https://review.whamcloud.com/53054
Lustre-commit: TBD (from 9146471f862d6c6fae6c1f6ac99f55d8280a2891)

Test-Parameters: trivial env=SANITY_EXCEPT="906" \
  mdtcount=4 mdscount=2 clientdistro=el9.3 testlist=sanity
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3

Change-Id: I9cce1a7d2249cb4df39106c44ba4417411ee0757
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53056
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-14955 lnet: Use fatal NI if none other available
Serguei Smirnov [Tue, 24 Aug 2021 20:48:41 +0000 (13:48 -0700)]
LU-14955 lnet: Use fatal NI if none other available

Allow NI in fatal state to be selected for sending if there are no
NIs in non-fatal state.
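
The fallback can be modelled in a few lines. This sketch treats each NI as a dictionary with a hypothetical "fatal" flag and ignores the health values and other criteria that the real LNet selection uses.

```python
def select_ni(nis):
    """Prefer a non-fatal NI; fall back to a fatal one if nothing else exists."""
    healthy = [ni for ni in nis if not ni["fatal"]]
    candidates = healthy or nis  # only use fatal NIs when no other NI is left
    return candidates[0] if candidates else None
```

Before this change, the equivalent of `candidates` would have been `healthy` alone, so a node whose only NI was in fatal state had nothing to send on.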

Lustre-change: https://review.whamcloud.com/44746/
Lustre-commit: ff3322fd0c77a8042558711d9f410326d2aa6375

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11019
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Iab8ef6ee5c5f45896196dbd88a2f61e004278297
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53153
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago RM-620 build: New tag 2.14.0-ddn115
Andreas Dilger [Tue, 14 Nov 2023 22:38:26 +0000 (15:38 -0700)]
RM-620 build: New tag 2.14.0-ddn115

New tag 2.14.0-ddn115

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8d964022825701d68ab711fb7fd5c22d7c1f6e2b

19 months ago LU-16374 enc: rename O_FILE_ENC to O_CIPHERTEXT
Sebastien Buisson [Sun, 24 Sep 2023 16:07:44 +0000 (12:07 -0400)]
LU-16374 enc: rename O_FILE_ENC to O_CIPHERTEXT

Rename O_FILE_ENC to O_CIPHERTEXT as per discussion in linux-fscrypt
mailing-list.
Also change the flag combination to be:
O_NOCTTY | O_NDELAY | O_DSYNC
to avoid the risk of accidental issues with tar that already opens
files with the 'O_NOCTTY | O_NDELAY' combination.

O_DSYNC does not make much sense for O_RDONLY files, but will force
writes on encrypted restore to be synchronous. With O_DIRECT and large
enough writes (32MB?) that might be OK, but not ideal for small files.
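
A small sketch of why the third flag matters, assuming a Linux host where Python exposes os.O_NOCTTY, os.O_NDELAY and os.O_DSYNC. The "all bits set" check in `wants_ciphertext()` is an assumption about how such a combination would be detected, not the Lustre implementation.

```python
import os

# Flag combination named in the commit message above.
O_CIPHERTEXT = os.O_NOCTTY | os.O_NDELAY | os.O_DSYNC

# Flags that tar is described as already using when opening files.
TAR_FLAGS = os.O_RDONLY | os.O_NOCTTY | os.O_NDELAY


def wants_ciphertext(flags: int) -> bool:
    """True only if every bit of the combination is set (assumed check)."""
    return (flags & O_CIPHERTEXT) == O_CIPHERTEXT
```

With O_DSYNC in the combination, tar's existing flags no longer match by accident.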

Lustre-Commit: ac522557b1fe3ea2b7275fa6d5df73691b8d06db
Lustre-Change: https://review.whamcloud.com/51640

Fixes: 4869c7a530 ("LU-14677 sec: no encryption key migrate/extend/resync/split")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I36fed17a413ee690bc445c3e76674ed5fc337de5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53049
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-17184 mgc: remove damaged local configs
Mikhail Pershin [Fri, 13 Oct 2023 21:28:58 +0000 (00:28 +0300)]
LU-17184 mgc: remove damaged local configs

If local config llog is damaged it can't be removed and
prevents target from mounting. This happens because
mgc_llog_local_copy() uses llog_erase() to remove llogs
which can't do the job if llog header is damaged.

Patch changes are:
- llog_erase() no longer initializes the header but just destroys
  the llog file
- mgc_llog_local_copy() no longer exits if the backup to a temp
  file fails but continues with remote llog copying anyway
- conf-sanity test_151 is added to check that target can
  mount with damaged local config

Lustre-change: https://review.whamcloud.com/52697
Lustre-commit: 6a6e4ee20fe5aaad4beab5477e1c7d05e4e702e2

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I637749c38fd5ed03bdac5ca1cd60196f724ab0d1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53124
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-16032 osd: move unlink of large objects to separate thread
Artem Blagodarenko [Fri, 13 Oct 2023 07:49:07 +0000 (15:49 +0800)]
LU-16032 osd: move unlink of large objects to separate thread

Final unlink and freeing of blocks for large objects can lead to
a hung thread with this call stack:

  Net: Service thread pid 1739 was inactive for 200.16s.
  The thread might be hung, or it might only be slow and will
  resume later.
  Dumping the stack trace for debugging purposes:
    __wait_on_buffer+0x2a/0x30
    ldiskfs_wait_block_bitmap+0xe0/0xf0 [ldiskfs]
    ldiskfs_read_block_bitmap+0x31/0x60 [ldiskfs]
    ldiskfs_free_blocks+0x329/0xbb0 [ldiskfs]
    ldiskfs_ext_remove_space+0x8a9/0x1150 [ldiskfs]
    ldiskfs_ext_truncate+0xb0/0xe0 [ldiskfs]
    ldiskfs_truncate+0x3b7/0x3f0 [ldiskfs]
    ldiskfs_evict_inode+0x58a/0x630 [ldiskfs]
    evict+0xb4/0x180
    iput+0xfc/0x190
    osd_object_delete+0x1f8/0x370 [osd_ldiskfs]
    lu_object_free.isra.30+0x68/0x170 [obdclass]
    lu_object_put+0xc5/0x3e0 [obdclass]
    ofd_destroy_by_fid+0x20e/0x500 [ofd]
    ofd_destroy_hdl+0x267/0x9f0 [ofd]
    tgt_request_handle+0xaee/0x15f0 [ptlrpc]
    ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
    ptlrpc_main+0xb34/0x1470 [ptlrpc]
    kthread+0xd1/0xe0

Let's move final unlink to workqueue if inode size > 1GB.  The size
threshold can be configured by setting the minimum async truncate size
with the "osd-ldiskfs.*.delay_unlink_mb" parameter.

Writes to "osd-ldiskfs.*.force_sync" parameter will flush pending
delayed unlinks so that space can be reclaimed as needed.
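
The behaviour described above can be modelled roughly as below. This is a user-space sketch, not the osd-ldiskfs code: `DelayedUnlinker`, its queue, and the return strings are hypothetical, and the 1024 MB default only mirrors the 1GB figure mentioned in this message.

```python
from collections import deque

MiB = 1 << 20


class DelayedUnlinker:
    """Toy model: large objects are queued, small ones are freed inline."""

    def __init__(self, delay_unlink_mb: int = 1024):
        self.delay_unlink_mb = delay_unlink_mb  # size threshold in MB
        self.pending = deque()  # objects waiting for the worker
        self.freed = []         # objects whose space was reclaimed

    def unlink(self, name: str, size_bytes: int) -> str:
        if size_bytes > self.delay_unlink_mb * MiB:
            # Too large to free in the service thread: defer it.
            self.pending.append(name)
            return "deferred"
        self.freed.append(name)
        return "immediate"

    def force_sync(self) -> None:
        # Flush all pending delayed unlinks so space is reclaimed now.
        while self.pending:
            self.freed.append(self.pending.popleft())
```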

Lustre-change: https://review.whamcloud.com/47995
Lustre-commit: a772e90243ea0ff1de6ae9c67e1f6384c431d200

Change-Id: Id535ae4c58732769effabee42835bc2da8cb5cc1
Signed-off-by: Artem Blagodarenko <ablagodarenko@whamcloud.com>
DDN-bug-id: DDN-3144
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53104
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-16827 obdfilter: Fix "emfperf obdfilter-survey" error
Vitaliy Kuznetsov [Fri, 10 Nov 2023 20:35:56 +0000 (21:35 +0100)]
LU-16827 obdfilter: Fix "emfperf obdfilter-survey" error

This patch fixes the definition of the lctl variable. It changes
the logic so that the LCTL value is assigned only when it was
defined earlier.

Lustre-change: https://review.whamcloud.com/53083
Lustre-commit: 95387e580a639eb9ff0648aecf69d0a4951325ef

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I4dfd7e3d1f78208b33b897d8e6680e59b690014c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53084
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago LU-16632 tests: more margin of error for sanity/56xh
Timothy Day [Sat, 11 Mar 2023 22:55:09 +0000 (22:55 +0000)]
LU-16632 tests: more margin of error for sanity/56xh

Give sanity test_56xh more time to migrate files inside the
VMs before failing.

Also, fix a typo.

Lustre-change: https://review.whamcloud.com/50262
Lustre-commit: 36cbba150bce9e2890c8b462ec2ce4af2d6353a5

Test-Parameters: trivial testlist=sanity env=ONLY=56xh,ONLY_REPEAT=100
Fixes: 55968bfabe ("LU-13482 utils: bandwidth limit for lfs migrate")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If89c8c3ee113c8a14d4c0463c7bb79e353130c08
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
19 months ago LU-17258 socklnd: ensure connection type established upon race
Chris Horn [Thu, 2 Nov 2023 19:28:45 +0000 (12:28 -0700)]
LU-17258 socklnd: ensure connection type established upon race

When a connection race is hit between two peers, only increment the
retry count if a connection of the specific type has already been
established; otherwise, this can lead to an unexpected value set in
ksnr_connected and some of the assertions being triggered in
ksocknal_connect():

"ASSERTION( (wanted & ((((1UL))) << (3))) != 0 ) failed"

Lustre-change: https://review.whamcloud.com/52957
Lustre-commit: 5afe3b0538c533c3cca370bc9c0901abccca299a

Fixes: da893c6c97 ("LU-16191 socklnd: limit retries on conns_per_peer mismatch")
HPE-bug-id: LUS-11922
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Change-Id: I6e8abb39ad3c0bcd7fbc8f8c5478c903029df908
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago RM-620 build: New tag 2.14.0-ddn114
Andreas Dilger [Fri, 10 Nov 2023 09:38:19 +0000 (02:38 -0700)]
RM-620 build: New tag 2.14.0-ddn114

New tag 2.14.0-ddn114

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia94862790d1dec3d8080b6d00445ca163afebf81

19 months ago EX-7601 osc: walk chunk unaligned RPC correctly
Patrick Farrell [Wed, 1 Nov 2023 20:14:12 +0000 (16:14 -0400)]
EX-7601 osc: walk chunk unaligned RPC correctly

For decompression, the client must start looking for
compressed chunks at a chunk aligned offset.

Implement this in decompress_request.
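
As a rough illustration of what "chunk aligned offset" means here, the helper below rounds an offset down to the start of its chunk. It assumes the chunk size is a power of two; `chunk_align_down()` is a hypothetical name, not a function from the osc code.

```python
def chunk_align_down(offset: int, chunk_size: int) -> int:
    """Round offset down to the start of the chunk that contains it."""
    if chunk_size <= 0 or chunk_size & (chunk_size - 1):
        raise ValueError("chunk_size must be a positive power of two")
    # Clearing the low bits yields the chunk-aligned starting offset.
    return offset & ~(chunk_size - 1)
```

An RPC that starts in the middle of a chunk would begin its search for compressed chunks at the value this returns rather than at its own first byte.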

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3273135990ddf51e8b3c651734e19350e91f659c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52933
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
19 months ago EX-7601 osc: remove unused 'wrkmem'
Patrick Farrell [Fri, 3 Nov 2023 19:56:27 +0000 (15:56 -0400)]
EX-7601 osc: remove unused 'wrkmem'

compress_chunk() takes a wrkmem buffer, which it does not
use.

Remove this and its allocation in compress_request.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I6f236f018f5b79c57cc8725ca0f95125810a4064
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52980
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
19 months ago EX-7601 osc: apply compressed flag to dst page
Patrick Farrell [Fri, 3 Nov 2023 18:11:46 +0000 (14:11 -0400)]
EX-7601 osc: apply compressed flag to dst page

The existing code to apply brw flags to compressed pages
has two issues:
1. The dst_page is NOT an osc async page, it is a bare BRW
page.  This means the brw_page2oap macro isn't right,
because there is no oap page.
Because oap_brw_flags is actually oap_brw_page.flag, we
don't ever access the memory pointed at by OAP, just use it
to find an offset back in to the brw page.

This means the flags are set correctly, but we still
shouldn't use this macro.
2. However, the function then overwrites these flags by
copying from a page in the source, so OBD_BRW_COMPRESSED is
lost.

Add OBD_BRW_COMPRESSED when we set flags.  This ensures the
flag is actually sent to the server on compressed IO.
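
The second issue and its fix can be pictured with plain bit flags. The bit values below are placeholders chosen for the example, not the real Lustre definitions, and both helper functions are hypothetical.

```python
OBD_BRW_COMPRESSED = 1 << 0  # placeholder bit, not the real value
OBD_BRW_OTHER = 1 << 1       # hypothetical unrelated flag on the source page


def dst_flags_before(src_flags: int) -> int:
    """Old behaviour: copying the source flags drops the compressed bit."""
    return src_flags


def dst_flags_after(src_flags: int) -> int:
    """New behaviour: add the compressed bit when the flags are set."""
    return src_flags | OBD_BRW_COMPRESSED
```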

This was not causing any problems because the server does
not actually use the OBD_BRW_COMPRESSED flag yet.
(EX-7601 uses this flag)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia94cdc803868ce16a0b66fd58578ec8b2d00cbae
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52979
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
19 months ago EX-8270 sptlrpc: don't crash for too-large chunk size
Andreas Dilger [Thu, 9 Nov 2023 00:10:05 +0000 (17:10 -0700)]
EX-8270 sptlrpc: don't crash for too-large chunk size

If the chunk size is too large, don't fall off the
end of the page_pool[] array with a large "order".

Test-Parameters: trivial
Fixes: d945f1b064 ("EX-6261 ptlrpc: extend sec bulk functionality")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I192ac1b227f1cab8405f6657e754101d353ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
19 months ago EX-7806 csdc: not support data compression on MDT
Bobi Jam [Wed, 23 Aug 2023 16:43:56 +0000 (00:43 +0800)]
EX-7806 csdc: not support data compression on MDT

Do not support setting a data compression component on DoM until
data compression on MDT is implemented.

Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I3794460140f08a073377c418dd56e7dda907d96d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52062
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago EX-7601 csdc: improve preview warning messages
Andreas Dilger [Thu, 9 Nov 2023 00:29:26 +0000 (17:29 -0700)]
EX-7601 csdc: improve preview warning messages

Avoid printing duplicate warning messages on the console when
creating files with multiple compressed components.  On the
flip side, log a console message when compression is enabled
so that this will later be visible if enabled on a system.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8cb2f67689824513335f3fa65e9ea751923ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
19 months ago LU-17205 utils: add lctl get_param -H option
Aurelien Degremont [Tue, 17 Oct 2023 13:07:45 +0000 (15:07 +0200)]
LU-17205 utils: add lctl get_param -H option

- Add a new '-H' option to 'lctl get_param' that will prefix
each output line with the parameter name instead of only
the first line by default.

That makes grepping lctl get_param with wildcards much easier
as you can now easily know which parameter returns which value.

  $ lctl get_param -H osc.*.state | grep current
  osc.lustre-OST0000-osc-ff1148c0.state=current_state: FULL
  osc.lustre-OST0001-osc-ff1248c0.state=current_state: DISCONN
  osc.lustre-OST0002-osc-ff1348c0.state=current_state: FULL
  osc.lustre-OST0003-osc-ff1448c0.state=current_state: FULL
  osc.lustre-OST0004-osc-ff1548c0.state=current_state: FULL

It also prints an output line even for empty values. That also
makes life easier for admins.

- The patch also removes the forced line feed if the parameter
value was larger than 80 chars. This was considered a misfeature
and is now dropped for all usages, with or without -H.
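
Because every line carries its parameter name, the -H output shown above is easy to post-process. The sketch below splits each line on the first "="; `parse_get_param_h()` is a hypothetical helper, and the sample text is taken from the example output in this message.

```python
def parse_get_param_h(text: str) -> dict:
    """Map each parameter name to the list of value lines printed for it."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only; values may contain ":" or "=".
        name, sep, value = line.partition("=")
        if sep:
            params.setdefault(name, []).append(value)
    return params


SAMPLE = """\
osc.lustre-OST0000-osc-ff1148c0.state=current_state: FULL
osc.lustre-OST0001-osc-ff1248c0.state=current_state: DISCONN
"""
```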

Lustre-change: https://review.whamcloud.com/52730
Lustre-commit: a12c352a3dd8d424b1da09efc6884530c60d105b

Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: Ib1fa0dc400db4c19fed10ad4cced9be5668418e3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53067
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-16639 misc: cleanup console messages
Andreas Dilger [Mon, 13 Mar 2023 22:08:30 +0000 (16:08 -0600)]
LU-16639 misc: cleanup console messages

The lprocfs_job_cleanup() was not properly dropping all jobstats
from the hash table and printing errors from job_stat_exit() at
unmount.  Ensure all stats are "old enough" when @clear is set.

Change early libcfs cfs_cpu_init() messages from CERROR() to
pr_err() to avoid circular dependencies on libcfs setup before
printing an error message to the console during module init.

Lustre-change: https://review.whamcloud.com/50283
Lustre-commit: 8f40a3d7110da1af8e310a4b7f40b86f13080938

Test-Parameters: trivial
Fixes: ea2cd3af7b ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ide3f502103392a79419cc1836200bf5a1a3ebbe5
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53063
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-17251 tests: use stderr in precreated_ost_obj_count()
Andreas Dilger [Thu, 9 Nov 2023 23:28:45 +0000 (16:28 -0700)]
LU-17251 tests: use stderr in precreated_ost_obj_count()

Write the status output to stderr instead of stdout, so that
it doesn't confuse the caller that is expecting the number
of objects precreated in stdout.
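
The general pattern is sketched below: human-readable status goes to stderr, and only the value the caller parses goes to stdout. `report_precreated()` and its message text are hypothetical and are not the test-framework shell function.

```python
import io
import sys


def report_precreated(count: int, out=sys.stdout, err=sys.stderr) -> None:
    """Print status on err and only the bare number on out."""
    # Status text is for humans and logs; keep it off stdout.
    print(f"status: {count} objects precreated", file=err)
    # The caller captures stdout and expects nothing but the count.
    print(count, file=out)


# Usage: capture both streams to confirm stdout holds only the number.
captured_out, captured_err = io.StringIO(), io.StringIO()
report_precreated(42, out=captured_out, err=captured_err)
```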

Test-Parameters: trivial testlist=sanity-scrub,sanity-lfsck
Fixes: c39bdce94f ("LU-17251 test: improve parallel-scale rr_alloc test")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib9b132a04a88b15cea34872954bfa5c4ddf8cde7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53062
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago RM-620 build: New tag 2.14.0-ddn113
Andreas Dilger [Thu, 9 Nov 2023 09:38:47 +0000 (02:38 -0700)]
RM-620 build: New tag 2.14.0-ddn113

New tag 2.14.0-ddn113

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I73eab3dc06a0488b7e68c7434cb8a6af2c590a2f

19 months ago RM-620 build: New tag lipe-2.36
Andreas Dilger [Thu, 9 Nov 2023 09:38:11 +0000 (02:38 -0700)]
RM-620 build: New tag lipe-2.36

New tag lipe-2.36

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6c986a17d42f4bd95009d9d0f03acc601c9ee2dd

19 months ago EX-7601 tests: improve/skip sanity test_460a
Andreas Dilger [Wed, 8 Nov 2023 22:39:28 +0000 (15:39 -0700)]
EX-7601 tests: improve/skip sanity test_460a

Skip sanity test_460a for el9.2 clients, since they appear to be
failing that test regularly, but no other distro client is.
Improve the log messages to see what stage is currently running.
Limit the "cmp --verbose" messages to one chunk, otherwise it
may print the entire 14MB test file (about 80 MiB of ASCII).

Move enable_compression() and disable_compression() functions
into test-framework.sh so that they can be used for all tests.

Set LFS_SETSTRIPE_COMPR_OK=y in enable_compression() since we
already know this is a preview and don't need it printed.

Allow sanity-compr.sh to specify SANITY_ONLY and/or SANITYN_ONLY,
and skip the other test script run if only one of them is set.

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY=460,HONOR_EXCEPT=y clientdistro=el9.2
Test-Parameters: testlist=sanity-compr env=SANITY_ONLY=460 clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8cb2f67689824513335f3fa65e9ea7519e3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53043
Tested-by: jenkins <devops@whamcloud.com>
19 months ago EX-8570 lipe: add lpcc sub command to trigger purge scan
Lei Feng [Wed, 8 Nov 2023 08:01:05 +0000 (16:01 +0800)]
EX-8570 lipe: add lpcc sub command to trigger purge scan

Add a sub command 'lpcc purge-scan' to trigger purge
scanning by sending SIGUSR2 to matching lpcc_purge
process.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I976621fe787daa15b8206eed97efdebe75cd7425
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53036
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago EX-8569 lipe: trigger lpcc_purge scan by SIGUSR2
Lei Feng [Wed, 8 Nov 2023 04:46:45 +0000 (12:46 +0800)]
EX-8569 lipe: trigger lpcc_purge scan by SIGUSR2

Send SIGUSR2 to lpcc_purge to trigger a scan
immediately.
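
The signalling mechanism can be sketched as below for a Unix host. This is not the lpcc_purge source: `ScanTrigger` and `request_scan()` are hypothetical names, and the handler only records the request instead of starting a real scan.

```python
import os
import signal


class ScanTrigger:
    """Records that an immediate scan was requested via SIGUSR2."""

    def __init__(self):
        self.requested = False

    def install(self) -> None:
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGUSR2, self._handle)

    def _handle(self, signum, frame) -> None:
        # Keep the handler minimal; the main loop would act on the flag.
        self.requested = True


def request_scan(pid: int) -> None:
    """Ask the process with this pid to scan now."""
    os.kill(pid, signal.SIGUSR2)
```

A main loop would check `requested` between sleeps, run the scan, and reset the flag.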

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I2811c90ac75c93167e8104e90b424ac31c8cc50c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago EX-8568 lipe: lpcc_purge can disable force scanning
Lei Feng [Wed, 8 Nov 2023 02:07:32 +0000 (10:07 +0800)]
EX-8568 lipe: lpcc_purge can disable force scanning

When force_scan_interval is set to -1, lpcc_purge will never
start force scanning.
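
A minimal sketch of that rule, assuming the interval is expressed in seconds and compared against the time since the last scan. `force_scan_due()` is a hypothetical helper, not the lpcc_purge code.

```python
def force_scan_due(now: float, last_scan: float,
                   force_scan_interval: int) -> bool:
    """Return True when a forced scan should start."""
    if force_scan_interval == -1:
        # -1 disables force scanning entirely.
        return False
    return now - last_scan >= force_scan_interval
```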

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I21bcadb97f09622eae08af73082196e816b2c9ae
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago EX-4125 lipe: adjust atime in lpcc_purge
Lei Feng [Mon, 6 Nov 2023 06:40:13 +0000 (14:40 +0800)]
EX-4125 lipe: adjust atime in lpcc_purge

Sometimes atime < mtime. In this case, adjust atime to mtime.
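
In other words, the access time used for purge decisions is never older than the modification time. A one-line sketch, with `adjusted_atime()` as a hypothetical name:

```python
def adjusted_atime(atime: int, mtime: int) -> int:
    """Use mtime in place of atime whenever atime is older than mtime."""
    return max(atime, mtime)
```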

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY="200-203"
Change-Id: I35b3da543b57265b09ef65f4e810761aa727f483
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53002
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago EX-8551 lipe: build arch-specific lipe-lpcc package
Lei Feng [Tue, 7 Nov 2023 04:06:24 +0000 (12:06 +0800)]
EX-8551 lipe: build arch-specific lipe-lpcc package

lpcc_purge in the lipe-lpcc package is an executable binary,
so an arch-specific lipe-lpcc package is needed.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I0387e258eaec6e39156f823d3a38b5dc3fb9a4cd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53007
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Raphael Druon <rdruon@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago LU-15576 osp: Interop skip sanity test 823 for MDS < 2.14.0-ddn112
Shaun Tancheff [Tue, 22 Feb 2022 07:28:50 +0000 (01:28 -0600)]
LU-15576 osp: Interop skip sanity test 823 for MDS < 2.14.0-ddn112

Prior to v2_14_55-29-g06e586016d, setting create_count greater
than the maximum returned -ERANGE.

During interop testing skip sanity/823 for MDS older than 2.14.0-ddn112.

Lustre-change: https://review.whamcloud.com/46567
Lustre-commit: 5da859e262dd5e93bfeb2bfa1366a9e20395d3f4

Test-Parameters: trivial serverversion=2.14.0 testlist=sanity env=ONLY=823
Fixes: 06e586016d3a ("LU-13941 osp: Silently lower requested create_count to maximum")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie79617deea047b2a846f696473b9c2b5681953be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53022
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago LU-10465 osd-ldiskfs: 8MiB IOs should bypass cache
Andreas Dilger [Fri, 3 Nov 2023 23:49:29 +0000 (17:49 -0600)]
LU-10465 osd-ldiskfs: 8MiB IOs should bypass cache

Changes the writethrough_max_io_mb and readcache_max_io_mb
params to check for IO size >= max_io_mb instead of > max_io_mb
when deciding to bypass cache.

Read/write IOs that are 8MiB in size should bypass the pagecache
on the OSTs, rather than requiring IOs that are slightly larger
than this.  8MiB is enough to submit 1MiB to each HDD spindle in
an 8+2 RAID6, and caching these writes on the OSS is not helping.
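
The effect of the comparison change, and the per-spindle arithmetic, can be checked with a short sketch. The function names are hypothetical; only the > versus >= comparison and the 8 MiB and 8+2 figures come from this message.

```python
MiB = 1 << 20


def bypass_cache_old(io_bytes: int, max_io_mb: int) -> bool:
    """Old check: only IOs strictly larger than the limit bypass cache."""
    return io_bytes > max_io_mb * MiB


def bypass_cache_new(io_bytes: int, max_io_mb: int) -> bool:
    """New check: IOs equal to the limit bypass cache as well."""
    return io_bytes >= max_io_mb * MiB


def per_disk_bytes(io_bytes: int, data_disks: int) -> int:
    """Bytes sent to each data disk when the IO is spread evenly."""
    return io_bytes // data_disks
```

With max_io_mb set to 8, an IO of exactly 8 MiB stays in the cache under the old check and bypasses it under the new one; spread over the 8 data disks of an 8+2 RAID6, that IO is 1 MiB per disk.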

Lustre-change: https://review.whamcloud.com/52989
Lustre-commit: TBD (from dcdc4748f1443981a170bc2945b178226e64a6d4)

Test-Parameters: trivial
Fixes: 3043c6f189 ("LU-12071 osd-ldiskfs: bypass pagecache if requested")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iae435f5b99e2e8bc6a9458fedad65a81c2853350
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>