Whamcloud - gitweb
Patrick Farrell [Sat, 28 Oct 2023 17:48:04 +0000 (13:48 -0400)]
EX-7601 ofd: rename 'local' in thread_big_cache
It's not a big deal since it's only used a few times, but
let's give this variable a descriptive name.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ide136cd42e885d59f1a2e4ce22a2e7449faca3f9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52874
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Sat, 28 Oct 2023 20:30:56 +0000 (16:30 -0400)]
EX-7601 ofd: do not overwrite rc in unmerge_chunk
unmerge_chunk should not be responsible for setting the
lnb rc, because this overwrites the result of any previous
activity on the lnb. Plus, unmerge_chunk can't fail.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id1ce590c7f1da3ab7faddbd685d264a33c08d639
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52876
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Thu, 16 Nov 2023 23:26:26 +0000 (18:26 -0500)]
EX-7601 osc: allow multiple chunks in read
It's rare, but reads can sometimes have multiple
discontiguous chunks. Update decompress_request to
handle this case.
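As a rough illustration of the case described above, the sketch below groups the page indices of one read into per-chunk runs, so that each chunk can be processed on its own even when the chunks are not adjacent. The helper name, the `pages_per_chunk` parameter, and the use of plain page indices are assumptions for illustration; this is not the actual decompress_request code.

```python
from itertools import groupby


def split_into_chunks(page_indices, pages_per_chunk):
    """Group page indices by the chunk they fall in.

    Hypothetical helper: returns a list of (chunk_number, [page indices]).
    """
    keyed = groupby(sorted(page_indices), key=lambda idx: idx // pages_per_chunk)
    return [(chunk, list(pages)) for chunk, pages in keyed]


# With 16 pages per chunk, pages 0-3 sit in chunk 0 and pages 32-33 in chunk 2,
# so this read covers two discontiguous chunks.
runs = split_into_chunks([0, 1, 2, 3, 32, 33], pages_per_chunk=16)
```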
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I880af95db285dce76db3610e8140a0f54baa401b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53159
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Thu, 16 Nov 2023 20:37:57 +0000 (15:37 -0500)]
EX-7601 osc: rename pages_in_chunk
Chunks can have variable numbers of pages in them.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If199d777367569e62c21305f6e4b9f3e4cce6d06
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 11 Nov 2023 20:39:22 +0000 (15:39 -0500)]
EX-7601 obd: move type switching to alloc_compr callers
The code is much cleaner if we can eliminate applied type
and handle that issue once per compression or decompression
rather than for every chunk. This requires moving the type
switching inside alloc_compr. (Also improve some error
messages - alloc_compr can fail with ENOMEM as well.)
The compression code currently allocates a transform for
every chunk on the client. This is relatively cheap, but
it also complicates the code by repeatedly checking if a
particular compression type is supported (this is the
"applied type" code).
Moving alloc_compr to compress/decompress request makes the
code much simpler.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I162e81577db721a9715d57b3f262fcabbcbf308a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53103
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Sun, 26 Nov 2023 19:42:14 +0000 (11:42 -0800)]
EX-8590 lipe: Use only one client for the test
Use only one client machine for hot-pools tests 75a, b, c.
Test-Parameters: trivial testlist=hot-pools
Test-Parameters: trivial testlist=hot-pools env=ONLY=75a
Test-Parameters: trivial testlist=hot-pools env=ONLY=75b
Test-Parameters: trivial testlist=hot-pools env=ONLY=75c
Test-Parameters: trivial testlist=hot-pools env=ONLY=75a
Test-Parameters: trivial testlist=hot-pools env=ONLY=75b
Test-Parameters: trivial testlist=hot-pools env=ONLY=75c
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Icfa958474ec928faeec63029a2d5983cea650bb7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53240
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 25 Nov 2023 05:27:36 +0000 (22:27 -0700)]
EX-8466 tests: limit 'cmp' output in sanity-pcc.sh
Limit the number of lines printed by 'cmp' when there is an error
comparing two files. Often the files are multiple MB in size, and
printing 1-32M lines of output when the test fails is not useful.
Instead, print the first 66000 lines of output by default, which is
enough to see a full 64KiB plus some lines to see if more than 64KiB
of data is incorrect. This is controlled by the CMP_LINES variable.
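The arithmetic behind the 66000-line default can be sketched as follows. This assumes one output line per differing byte, which is how the message above relates lines to KiB; `limit_output` is a hypothetical stand-in for the truncation done in the test script, not its real code.

```python
from itertools import islice

CMP_LINES = 66000  # default limit named in the commit message


def limit_output(lines, max_lines=CMP_LINES):
    """Return at most max_lines items from an iterable of output lines."""
    return list(islice(lines, max_lines))


# Assumption: one output line per differing byte.
full_64k = 64 * 1024                # lines needed to show a full 64KiB
spare_lines = CMP_LINES - full_64k  # lines left to show damage past 64KiB
```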
Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I80f4d5d3460d531ab63788185a2c88e79415a801
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Andreas Dilger [Fri, 24 Nov 2023 07:48:44 +0000 (00:48 -0700)]
LU-17312 tests: skip conf-sanity test_53 in interop
Skip conf-sanity test_53 in interop because older servers cannot
stop any running service threads above threads_max.
Remove old test interop for servers < 2.3.
Lustre-change: https://review.whamcloud.com/53226
Lustre-commit: TBD (from d029a1cb45ac440e580c177866f0e9766444d8f1)
Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: testlist=conf-sanity env=ONLY=53 serverversion=EXA5
Fixes: 183cb1e3cd ("LU-947 ptlrpc: allow stopping threads above threads_max")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia95405060c607c7a070720ed32a7a43b1c3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53227
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Artem Blagodarenko [Thu, 2 Nov 2023 22:20:52 +0000 (22:20 +0000)]
EX-7600 osd: save compressed object size on zfs
"osc: save compressed object size" added means to transfer
object size to the osd and added ldiskfs support.
This patch adds saving the object size to the ZFS backend.
Currently this fix is submitted as a separate patch for
testing purposes, but it can be merged into the main patch later.
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Test-Parameters: trivial testlist=sanity fstype=zfs
Change-Id: I99e29e3f756a070b5f3cece12c4ca58f668a2ecf
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 23 Nov 2023 01:29:08 +0000 (18:29 -0700)]
EX-8671 tests: use smaller files in sanity-pcc/103+104
Running fallocate is fast, but the actual PCC data copy may be slow.
Use smaller test files for sanity-pcc test_103 and test_104 to speed
up testing, and also wait longer in case the copy is slow.
Add some extra debugging on failure so we can see the file attach
state on failure, in case there is something wrong with the parsing.
Test-Parameters: trivial testlist=sanity-pcc
Test-Parameters: testlist=sanity-pcc env=ONLY=103,ONLY_REPEAT=100
Test-Parameters: testlist=sanity-pcc env=ONLY=104,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I09f159810a778b8ef2bab93d0e2869237a3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Patrick Farrell [Wed, 18 Oct 2023 17:40:51 +0000 (13:40 -0400)]
EX-7601 osd: osd_bufs_put does not always handle all pages
osd_bufs_put asserts that the dio pages used after are
always zero, but there's no reason for this to be true and
compression specifically violates this by using 1 page at
a time.
Without this patch, we hit this assert and crash when
nonrotational = 1.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If6bdb11f254c260e2da4cabe11a82693a468e6fb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52750
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Andreas Dilger [Sun, 26 Nov 2023 05:58:09 +0000 (22:58 -0700)]
LU-17251 osp: start OST object precreate earlier
If the OST object precreate count gets large (usually due to high
MDT file create workload, but sometimes also forced during testing)
then send an OST_CREATE RPC sooner when the number of precreated
objects gets low.
Currently the MDS will wait until 1/2 of the precreated OST objects
are consumed, but if create_count = 10000, then this can put bursty
create workloads on the OST. Instead, send an OST_CREATE RPC when
the precreate pool is at most 1024 objects below target, so that the
MDS keeps its precreated pool more full and the OST doesn't have to
create so many objects at once (which also locks object directories
for a longer time).
Don't set opd_force_creation=true when osp.*.create_count is set
larger, and instead rely on the improved precreate check to force
OST object creation to start sooner, as opd_force_creation=true
can cause the OSP precreation to stop completely in some cases.
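The difference between the two trigger conditions can be sketched as below. The function names and the exact comparison are assumptions made for illustration from the description above; the real check in the OSP code may differ in detail.

```python
def old_should_precreate(available, create_count):
    """Old policy: wait until half of the precreated pool is consumed."""
    return available <= create_count // 2


def new_should_precreate(available, create_count, max_shortfall=1024):
    """Sketch of the new policy: refill once the pool is short by
    min(create_count / 2, max_shortfall) objects."""
    shortfall = create_count - available
    return shortfall >= min(create_count // 2, max_shortfall)
```

With create_count = 10000, the old check only fires once 5000 objects are gone, while the sketched new check fires after 1024 objects are consumed. For small pools (for example create_count = 32) the two checks coincide.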
Lustre-change: https://review.whamcloud.com/53245
Lustre-commit: TBD (from 6ffb849d7086a2b2ae48f274d4f5b1b8fbf83fe2)
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=sanity env=ONLY=1-130,HONOR_EXCEPT=y
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=10
Fixes: df5b4c0a8b ("LU-17251 osp: force precreate if create_count grows")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id2d12636d535485919ca5eec3adb18b1e6ce7057
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53244
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 24 Nov 2023 09:35:29 +0000 (02:35 -0700)]
RM-620 build: New tag 2.14.0-ddn117
New tag 2.14.0-ddn117
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ida37fd66ddfd7331efbc3a2276ddaf0f574f5de5
Bobi Jam [Fri, 13 Jan 2023 04:36:01 +0000 (12:36 +0800)]
LU-16468 llite: protect layout before read IO going
Before the read IO starts, file_read_confine_iter() may call
lov_attr_get() to get the proper kms (known minimum size of the file).
lov_attr_get() presumes that it is called under ongoing IO, which
protects the layout from changing, but that is not so in this case.
Lustre-change: https://review.whamcloud.com/49622
Lustre-commit: from
e050b91c6c471d3576eba3bbf4f3c31aef644e3f
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I1b36ec6e158331e63e8026ee2b986d5a7e3cb6dc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49623
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Nathaniel Clark [Wed, 15 Nov 2023 21:09:10 +0000 (16:09 -0500)]
EX-8386 lipe: Remove cruft from systemd services
Remove After=rust-iml-agent.service
rust-iml-agent is deprecated and no longer needed.
Change-Id: Icd0e79dbd417e98beb07f8546487d20fa5f6bb62
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53152
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Wed, 8 Nov 2023 10:41:21 +0000 (18:41 +0800)]
EX-8282 lfs: migrate compressed file without stripe info
Running lfs migrate on a file without specifying stripe info takes the
layout info from the file as the target layout template, and
llapi_layout_get_by_xattr() tries to convert LOV_PATTERN_* values
to user scope LLAPI_LAYOUT_* values, but LOV_PATTERN_COMPRESS
is missed in this conversion.
This patch adds a function llapi_pattern_from_lov() to handle this
conversion specifically.
This patch also adds more error messages for llapi_layout_file_open().
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I49a43cc7761cd2baed7a5da7d4e7cff2152ff9bb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 21 Nov 2023 17:34:31 +0000 (12:34 -0500)]
EX-7601 osc: remove &pga usage in compress_request
The usage of 'pga' and '&pga' in compress_request is
confusing, but also, compress_request modifies &pga by
allocating a new compressed page array. Except if we fail
in compress_request, we free that new page array.
This means failing in compress_request replaces 'pga' with
a pointer to freed memory. Instead, create an explicit
cpga pointer in the caller and use that. This allows
compress_request to fail safely.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idaf592103c57b0e9ce76ab520a69b819d4f37be9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53120
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 21 Nov 2023 17:33:06 +0000 (12:33 -0500)]
EX-7601 osc: give compress_request explicit success
Compress_request has explicit failure handling, but the
success handling just follows the failure handling. This is
confusing - on failure, we do:
page_count = *pcount
then immediately do:
*pcount = page_count
It also sets *orig_pga = pga on success OR failure, which
is wrong because compress_request may have modified pga and
then failed.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I121ec71cfe35babc4a572951e93f7581887ade80
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53119
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 21 Nov 2023 17:32:20 +0000 (12:32 -0500)]
EX-7601 osc: rearrange compress_request
A trivial rearrangement of compress_request to make it
more readable before redoing the core logic.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1d34cd2a2a6d84bc30cc7dae8eb07586c4837f7d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53110
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Mon, 13 Nov 2023 04:20:01 +0000 (23:20 -0500)]
EX-7601 osc: replace assert with error
We shouldn't assert on values read from storage; if they
are incorrect, we should return EIO instead.
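A minimal sketch of the idea, assuming a hypothetical header with a compressed size that must fit within the chunk size (the field names and the specific bounds are illustrative, not taken from the patch): bad on-disk values produce an error code for the caller instead of an assertion failure.

```python
import errno


def check_chunk_header(compressed_size, chunk_size):
    """Return 0 if the sizes read from storage are plausible, else -EIO.

    Hypothetical check: data from storage can be corrupt, so it is
    validated and reported as an I/O error rather than asserted on.
    """
    if compressed_size <= 0 or compressed_size > chunk_size:
        return -errno.EIO
    return 0


rc_ok = check_chunk_header(4096, 65536)
rc_bad = check_chunk_header(70000, 65536)
```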
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icda213e3c5a90a848c9b008788e92ee49e2efcb1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53108
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Mon, 13 Nov 2023 04:13:10 +0000 (23:13 -0500)]
EX-7601 osc: variable cleanup in decompress_req
Use type and lvl variables in decompress_request.
Remove an unused variable and an assert which can never
fire.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ieff57411a2a41215fd368d731614801bd0f43e38
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53107
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 11 Nov 2023 20:21:01 +0000 (15:21 -0500)]
EX-7601 obd: move module load to function
This is a trivial code change to make alloc_compr a bit
shorter.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0a790afe7afebde1d223420d9a578529da6ff7e5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53102
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 10 Nov 2023 22:21:52 +0000 (17:21 -0500)]
EX-7601 ofd: make compress_chunk take chunk_bits
Chunk bits is used everywhere, so have compress_chunk convert
to log bits rather than have the callers do it.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic01bb749425cb95d9c5717965d692a18138ceeb7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53100
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 10 Nov 2023 22:19:00 +0000 (17:19 -0500)]
EX-7601 osc: cleanup compression variables
Make usage of the compression variables more readable.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I6daff56b56877c8f36e02303cc0579ba7faa731b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53099
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 10 Nov 2023 22:10:34 +0000 (17:10 -0500)]
EX-7601 osc: rename 'done'
Rename the ambiguous 'done' and remove it where not used.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8fb88b7a91fcc7dbd5ce2d29a61c18330fc0cda3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53098
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 10 Nov 2023 22:07:39 +0000 (17:07 -0500)]
EX-7601 osc: rename pages_in_chunk
Use the more standard pages_per_chunk.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I47e0995fe8aa8d1a9a610669d6cd4c39559b6fa4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53097
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sun, 12 Nov 2023 19:52:28 +0000 (14:52 -0500)]
EX-7600 osc: use pages_left in unmerge_chunk
Since we have compressed chunks < chunk_size (if they're
after EOF), we must use pages_left in unmerge_chunk or it
will go off the end of the page array.
This also lets us remove the workaround where unmerge_chunk
would skip pages that were not present. unmerge_chunk
always works with a known and complete set of pages, so this
check is unneeded.
We should also check that our count of bytes is correct
when we finish.
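The bounded copy and the final byte-count check can be sketched as follows. This is a simplified model that uses bytearrays as pages; the signature and the `PAGE_SIZE` value are assumptions for illustration and do not mirror the real unmerge_chunk.

```python
PAGE_SIZE = 4096


def unmerge_chunk(buf, pages, first_page, pages_left):
    """Copy buf into pages[first_page : first_page + pages_left].

    Only pages_left pages are touched, so a short chunk near EOF cannot
    run past the end of the page array. Returns the bytes copied.
    """
    copied = 0
    for i in range(pages_left):
        piece = buf[copied:copied + PAGE_SIZE]
        if not piece:
            break
        pages[first_page + i][:len(piece)] = piece
        copied += len(piece)
    # Verify that the byte count is correct once the copy is finished.
    if copied != len(buf):
        raise ValueError("unmerged %d of %d bytes" % (copied, len(buf)))
    return copied


# A short final chunk: only 3 pages remain, and the last one is partial.
pages = [bytearray(PAGE_SIZE) for _ in range(3)]
data = bytes([7]) * (3 * PAGE_SIZE - 100)
copied = unmerge_chunk(data, pages, first_page=0, pages_left=3)
```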
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I88896307990ff839514e54e9a7e18390a457e5d8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53095
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Mon, 13 Nov 2023 16:18:49 +0000 (11:18 -0500)]
EX-7601 osc: only set compressed flag on compressed pages
The code accidentally sets the compressed flag on all
pages processed through fill_cpga, even if they're not
compressed. Oops.
Also stop setting pg->index on the pages in the compressed
pga, this is only used by encryption and that's no longer
supported with compression.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I313fd943a18b71cd52493852a6884f30d187e52f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53118
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Fri, 10 Nov 2023 19:07:07 +0000 (14:07 -0500)]
EX-7601 osc: remove cpga fill bits
cpga fill bits are not needed now that we don't support
compression and encryption.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I13c2278e085e9b288bd896585947e28e2ea505ca
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53082
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 1 Nov 2023 21:07:23 +0000 (17:07 -0400)]
EX-7601 ofd: add obd level compression lib
Some compression functions will be used by several areas
of Lustre, so they need to be in obdclass.
This moves merge_chunk and unmerge_chunk there and adds the
ability for them to merge lnbs. This is used in a future
patch.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If4a318119bb7685e41adb9f3b31a66074031e6ac
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 14 Nov 2023 22:58:16 +0000 (17:58 -0500)]
EX-7601 llite: restrict readahead to eof
Compressed file readahead rounding needs to come before
readahead is limited to EOF.
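Why the order matters can be shown with a small sketch. It assumes readahead windows are expressed as page indices and that a chunk is a fixed number of pages; the function and parameter names are hypothetical.

```python
def readahead_end(end_page, chunk_pages, eof_page):
    """Round the window end up to the last page of its chunk, then
    limit it to EOF. Doing these in the other order could round the
    already limited end back up past EOF."""
    rounded = -(-(end_page + 1) // chunk_pages) * chunk_pages - 1
    return min(rounded, eof_page)


# A window ending at page 20 with 16-page chunks rounds up to page 31,
# and is then limited to an EOF at page 25.
end = readahead_end(20, chunk_pages=16, eof_page=25)
```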
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4e9e7fe63301c08efcb05f170726735593a9431d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53137
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 23 Nov 2023 22:56:00 +0000 (15:56 -0700)]
LU-16032 tests: restore delay_unlink_mb in sanity/360
Restore the original value of osd-ldiskfs.*.delay_unlink_mb after
sanity test_360 is finished, so that it doesn't have an impact on
later tests running, in particular sanity-quota.sh was seeing some
delay in freeing quota for files that were just deleted.
Lustre-change: https://review.whamcloud.com/53218
Lustre-commit: TBD (from 8fa0580fd64fe7cbe969817ece87a161c517c4c3)
Test-Parameters: trivial testlist=sanity-quota
Fixes: a772e90243 ("LU-16032 osd: move unlink of large objects to separate thread")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7c1ab02262afdef2fc51f9fbc3932d954a4f8304
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53219
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Nikitas Angelinas [Wed, 11 May 2022 22:54:08 +0000 (15:54 -0700)]
LU-15777 hsm: set changelog error for restore layout swap failure
Set the error code in the changelog record generated, if the layout swap
fails at the end of an HSM restore operation. Also, handle error code
overflow inside hsm_set_cl_error(), so that callers don't need to do
this themselves.
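Handling the overflow in one place might look roughly like the sketch below. The two constants are assumed values for a 7-bit error field with one value reserved to mean overflow; they are labeled here as assumptions and should be checked against the real changelog flag definitions.

```python
# Assumed values for illustration only: a 7-bit error field where the
# largest value is reserved to mean "error code did not fit".
MAX_ERROR = 0x7E
ERROR_OVERFLOW = 0x7F


def clamp_cl_error(error):
    """Return the value to store for a non-negative error code,
    substituting the overflow marker when it does not fit."""
    return ERROR_OVERFLOW if error > MAX_ERROR else error
```

With the clamp inside the setter, callers can pass any error code without first checking whether it fits.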
Lustre-change: https://review.whamcloud.com/47121
Lustre-commit: 09fe64719b888cd212b6cffe923545b7650f230f
Suggested-by: Olaf Weber <olaf.weber@hpe.com>
Suggested-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Change-Id: I4ed2ebffa3bc1c6a0f87ea9f13734e344f77006f
HPE-bug-id: LUS-10863
Test-Parameters: testlist=sanity-hsm,sanity-pcc
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Hongchao Zhang [Thu, 26 Oct 2023 12:46:44 +0000 (20:46 +0800)]
LU-17115 quota: fix race of acquiring locks in qmt
In qmt_delete_qid and qmt_reset_qid, the order of acquiring
the lquota_entry lock and the journal is different from that
in qmt_dqacq0, which could cause a deadlock in some cases.
Lustre-change: https://review.whamcloud.com/52371
Lustre-commit: ee0e9447e7022e2caa8b161657d505e17ccdc4a1
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ic439f2c5d6ca22429422b87f0dde65e0d2e6113d
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53047
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Hongchao Zhang [Thu, 19 Oct 2023 06:33:47 +0000 (14:33 +0800)]
LU-16097 quota: release preacquired quota when over limits
The pre-acquired quota on each MDT or OST should be released when
the whole quota is over limits, for instance, after the quota limits
have been decreased for some quota ID by the administrator.
Lustre-change: https://review.whamcloud.com/48576
Lustre-commit: 57ac32a22372065b789ca491a568f075e755d339
Test-Parameters: testlist=sanity-quota
Test-Parameters: testlist=sanity-quota
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I6263b835d4ae6a3fd03f9a2bc4f463949cbc74d4
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53070
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexander Boyko [Tue, 22 Aug 2023 09:53:14 +0000 (05:53 -0400)]
LU-17142 mgc: reconnection without pinger
When the MGS has been offline for some time, AT is increased and the
connection request deadline is high. Reconnecting with the pinger
waits for the request deadline before the next attempt. The situation is
worse with a failover partner, when different connections are used.
Reconnection could fail with a local MGS too.
Here is the error when the MGC could not connect to a local MGS (MDT
combined with MGS):
LustreError: 15c-8: MGC90@kfi:
Confguration from log kjlmo12-MDT0000 failed from MGS -5.
The patch forces reconnection with import invalidate and aborts
inflight requests.
ptlrpc_recover_import() aborts waiting for disconnect import state.
But disconnect happens between connection attempt and it is valid.
This is fixed.
Reset Adaptive Timeout when local MGS starts. It allows MGC to
reconnect efficiently.
mgs_barrier_gl_interpret_reply() should handle -EINVAL from a client,
which means the client doesn't have a lock.
Lustre-change: https://review.whamcloud.com/52498
Lustre-commit: 867ba433e3a0fce4a1b2f8d37a91d550ada41a26
HPE-bug-id: LUS-11633
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ie631e04fb3e72900af076cf7f268f20f7b285445
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53116
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 22 Nov 2023 21:11:28 +0000 (14:11 -0700)]
RM-620 build: New tag 2.14.0-ddn116
New tag 2.14.0-ddn116
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf3d0d8a468b44c0bd179bc729fc66483cb45581
Andreas Dilger [Wed, 22 Nov 2023 21:10:48 +0000 (14:10 -0700)]
RM-620 build: New tag 2.14.0-2.14.0-ddn116
New tag 2.14.0-2.14.0-ddn116
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I752cf0dfd78de778fe34787b2e026fec0277f610
Qian Yingjin [Fri, 10 Nov 2023 09:23:46 +0000 (04:23 -0500)]
EX-8236 pcc: abort data copy via ll_fid_path_copy
For data copying via ll_fid_path_copy in direct I/O mode in user
space, the client calls llapi_pcc_state_fd() to obtain the file
PCC state. If it is marked with PCC_STATE_FL_ATTACH_ABORTING, the
data copy process ll_fid_path_copy exits immediately.
To reduce the overhead of this check, we do not check on every
data copy iteration; instead, we check once every certain number of
I/Os (32 by default). For an I/O size of 32MiB, that is one check
per second at 1GiB/s. There may be some time lag
before the copy tool finally quits.
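The periodic check and its arithmetic can be sketched as below. The `is_aborting` callable is a hypothetical stand-in for querying the PCC state; the loop structure is an assumption for illustration and not the copy tool's real code.

```python
MIB = 1 << 20
GIB = 1 << 30


def copy_with_abort_check(total_ios, is_aborting, check_interval=32):
    """Run up to total_ios copy steps, polling is_aborting() only on
    every check_interval-th step. Returns the number of steps completed."""
    completed = 0
    for io in range(total_ios):
        if io % check_interval == 0 and is_aborting():
            break
        # ... one I/O-sized piece of data would be copied here ...
        completed += 1
    return completed


# 32 I/Os of 32MiB each move 1GiB between checks, which at 1GiB/s
# works out to about one check per second.
bytes_between_checks = 32 * MIB * 32

polls = []


def never_aborting():
    polls.append(1)
    return False


completed = copy_with_abort_check(100, never_aborting)
```

The trade-off is the one noted above: an abort request is only noticed at the next check, so up to 31 further I/Os may complete first.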
Change-Id: I20631e5481a7e97d7a1ed0729bcd269ef6248a2c
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53073
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Fri, 10 Nov 2023 09:17:50 +0000 (17:17 +0800)]
EX-7331 csdc: prohibit set compression upon encrypted file
Setting a compression layout component on an encrypted file is not
allowed for now.
This patch adds this check on the MDS when creating a file with a layout,
or adding/merging a new mirror to an existing file.
Test-Parameters: testlist=sanity-sec env=ONLY=67,PTLDEBUG=-1
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I60d9f4bfce3a498f1eb3994c6276afb9d89c99a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53075
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Fri, 17 Nov 2023 07:53:21 +0000 (15:53 +0800)]
EX-8584 tests: check and wait lpcc_purge scanning ends
Check lpcc_purge status to make sure it finishes at least
one round of scanning.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY="200 201 202",ONLY_REPEAT=50
Change-Id: I8e6f50393d1a3cbb7a1bc976942631db6ecceb67
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53167
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Tue, 1 Nov 2022 02:57:39 +0000 (10:57 +0800)]
LU-16284 utils: lfs getstripe follows symlink
'lfs getstripe' prints the information of symlink target by default.
With '--no-follow' option it prints the information of symlink itself.
Lustre-change: https://review.whamcloud.com/49003
Lustre-commit: af32b516593dbf2a8e7a85d885c33fd017926ada
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I6cef01af5bb2235bdcbf0b5c99af4b9ed5869515
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53139
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Mon, 20 Nov 2023 22:32:40 +0000 (14:32 -0800)]
LU-17275 kernel: RHEL 8.9 client support
This patch makes changes to support RHEL 8.9 release
with kernel 4.18.0-513.5.1.el8_9 for Lustre client.
Lustre-change: https://review.whamcloud.com/53071
Lustre-commit: TBD (from 0da16c715a06b6426a6b99c111147fc875784e85)
Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=el8.9 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-3
Change-Id: Ia3672d134534b877bb6aaffb4cea0339bc55974f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53089
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Fri, 17 Nov 2023 18:02:00 +0000 (10:02 -0800)]
LU-17293 kernel: update SLES15 SP5 [5.14.21-150500.55.36.1]
Update SLES15 SP5 kernel to 5.14.21-150500.55.36.1 for Lustre client.
Lustre-change: https://review.whamcloud.com/53156
Lustre-commit: TBD (from 3e50280434d250996dfaa9d68d7da5e2c45d59ef)
Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp5 testlist=sanity
Change-Id: I5a9afb313e9bf315ef4af5b6602785ee68c4c247
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53172
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 9 Nov 2023 19:01:19 +0000 (11:01 -0800)]
LU-17274 kernel: new kernel [RHEL 9.3 5.14.0-362.8.1.el9_3]
This patch makes changes to support new RHEL 9.3 release
for Lustre client.
Lustre-change: https://review.whamcloud.com/53054
Lustre-commit: TBD (from 9146471f862d6c6fae6c1f6ac99f55d8280a2891)
Test-Parameters: trivial env=SANITY_EXCEPT="906" \
mdtcount=4 mdscount=2 clientdistro=el9.3 testlist=sanity
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3
Change-Id: I9cce1a7d2249cb4df39106c44ba4417411ee0757
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53056
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Tue, 24 Aug 2021 20:48:41 +0000 (13:48 -0700)]
LU-14955 lnet: Use fatal NI if none other available
Allow NI in fatal state to be selected for sending if there are no
NIs in non-fatal state.
Lustre-change: https://review.whamcloud.com/44746/
Lustre-commit: ff3322fd0c77a8042558711d9f410326d2aa6375
Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11019
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Iab8ef6ee5c5f45896196dbd88a2f61e004278297
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53153
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 14 Nov 2023 22:38:26 +0000 (15:38 -0700)]
RM-620 build: New tag 2.14.0-ddn115
New tag 2.14.0-ddn115
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8d964022825701d68ab711fb7fd5c22d7c1f6e2b
Sebastien Buisson [Sun, 24 Sep 2023 16:07:44 +0000 (12:07 -0400)]
LU-16374 enc: rename O_FILE_ENC to O_CIPHERTEXT
Rename O_FILE_ENC to O_CIPHERTEXT as per discussion in linux-fscrypt
mailing-list.
Also change the flag combination to be:
O_NOCTTY | O_NDELAY | O_DSYNC
to avoid the risk of accidental issues with tar that already opens
files with the 'O_NOCTTY | O_NDELAY' combination.
O_DSYNC does not make much sense for O_RDONLY files, but will force
writes on encrypted restore to be synchronous. With O_DIRECT and large
enough writes (32MB?) that might be OK, but not ideal for small files.
Lustre-Commit: ac522557b1fe3ea2b7275fa6d5df73691b8d06db
Lustre-Change: https://review.whamcloud.com/51640
Fixes: 4869c7a530 ("LU-14677 sec: no encryption key migrate/extend/resync/split")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I36fed17a413ee690bc445c3e76674ed5fc337de5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53049
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Fri, 13 Oct 2023 21:28:58 +0000 (00:28 +0300)]
LU-17184 mgc: remove damaged local configs
If local config llog is damaged it can't be removed and
prevents target from mounting. This happens because
mgc_llog_local_copy() uses llog_erase() to remove llogs
which can't do the job if llog header is damaged.
Patch changes are:
- llog_erase() no longer initializes the header but just destroys
the llog file
- mgc_llog_local_copy() no longer exits if the backup to a temp
file fails but continues with remote llog copying anyway
- conf-sanity test_151 is added to check that target can
mount with damaged local config
Lustre-change: https://review.whamcloud.com/52697
Lustre-commit: 6a6e4ee20fe5aaad4beab5477e1c7d05e4e702e2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I637749c38fd5ed03bdac5ca1cd60196f724ab0d1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53124
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Artem Blagodarenko [Fri, 13 Oct 2023 07:49:07 +0000 (15:49 +0800)]
LU-16032 osd: move unlink of large objects to separate thread
Final unlink and freeing of blocks for large objects can lead to
a hung thread with this call stack:
Net: Service thread pid 1739 was inactive for 200.16s.
The thread might be hung, or it might only be slow and will
resume later.
Dumping the stack trace for debugging purposes:
__wait_on_buffer+0x2a/0x30
ldiskfs_wait_block_bitmap+0xe0/0xf0 [ldiskfs]
ldiskfs_read_block_bitmap+0x31/0x60 [ldiskfs]
ldiskfs_free_blocks+0x329/0xbb0 [ldiskfs]
ldiskfs_ext_remove_space+0x8a9/0x1150 [ldiskfs]
ldiskfs_ext_truncate+0xb0/0xe0 [ldiskfs]
ldiskfs_truncate+0x3b7/0x3f0 [ldiskfs]
ldiskfs_evict_inode+0x58a/0x630 [ldiskfs]
evict+0xb4/0x180
iput+0xfc/0x190
osd_object_delete+0x1f8/0x370 [osd_ldiskfs]
lu_object_free.isra.30+0x68/0x170 [obdclass]
lu_object_put+0xc5/0x3e0 [obdclass]
ofd_destroy_by_fid+0x20e/0x500 [ofd]
ofd_destroy_hdl+0x267/0x9f0 [ofd]
tgt_request_handle+0xaee/0x15f0 [ptlrpc]
ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
ptlrpc_main+0xb34/0x1470 [ptlrpc]
kthread+0xd1/0xe0
Let's move final unlink to a workqueue if inode size > 1GB. The size
threshold can be configured by setting the minimum async truncate size
with the "osd-ldiskfs.*.delay_unlink_mb" parameter.
Writes to "osd-ldiskfs.*.force_sync" parameter will flush pending
delayed unlinks so that space can be reclaimed as needed.
Lustre-change: https://review.whamcloud.com/47995
Lustre-commit: a772e90243ea0ff1de6ae9c67e1f6384c431d200
Change-Id: Id535ae4c58732769effabee42835bc2da8cb5cc1
Signed-off-by: Artem Blagodarenko <ablagodarenko@whamcloud.com>
DDN-bug-id: DDN-3144
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53104
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
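The size check described in LU-16032 above can be sketched as follows. This is an illustration only, not the osd-ldiskfs code: `unlink_path` is a hypothetical name, and the 1024 MiB default is an assumption taken from the "1GB" figure in the commit message.

```python
MIB = 1 << 20

def unlink_path(inode_size, delay_unlink_mb=1024):
    """Decide where the final unlink of an object would run.

    delay_unlink_mb mirrors the tunable named in the commit message;
    the default of 1024 is an assumption based on the "1GB" wording.
    """
    threshold = delay_unlink_mb * MIB
    # Objects strictly larger than the threshold are handed to a workqueue.
    return "workqueue" if inode_size > threshold else "inline"

print(unlink_path(2048 * MIB))  # large object: deferred
print(unlink_path(4 * MIB))     # small object: freed in the calling thread
```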
Vitaliy Kuznetsov [Fri, 10 Nov 2023 20:35:56 +0000 (21:35 +0100)]
LU-16827 obdfilter: Fix "emfperf obdfilter-survey" error
This patch fixes the definition of the lctl variable. It changes
the logic so that the LCTL value is assigned only when it was
defined earlier.
Lustre-change: https://review.whamcloud.com/53083
Lustre-commit: 95387e580a639eb9ff0648aecf69d0a4951325ef
Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I4dfd7e3d1f78208b33b897d8e6680e59b690014c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53084
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Sat, 11 Mar 2023 22:55:09 +0000 (22:55 +0000)]
LU-16632 tests: more margin of error for sanity/56xh
Give sanity test_56xh more time to migrate files inside the
VMs before failing.
Also, fix a typo.
Lustre-change: https://review.whamcloud.com/50262
Lustre-commit: 36cbba150bce9e2890c8b462ec2ce4af2d6353a5
Test-Parameters: trivial testlist=sanity env=ONLY=56xh,ONLY_REPEAT=100
Fixes: 55968bfabe ("LU-13482 utils: bandwidth limit for lfs migrate")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If89c8c3ee113c8a14d4c0463c7bb79e353130c08
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Chris Horn [Thu, 2 Nov 2023 19:28:45 +0000 (12:28 -0700)]
LU-17258 socklnd: ensure connection type established upon race
When a connection race is hit between two peers, only increment the
retry count if a connection of the specific type has already been
established; otherwise, this can lead to an unexpected value set in
ksnr_connected and some of the assertions being triggered in
ksocknal_connect():
"ASSERTION( (wanted & ((((1UL))) << (3))) != 0 ) failed"
Lustre-change: https://review.whamcloud.com/52957
Lustre-commit: 5afe3b0538c533c3cca370bc9c0901abccca299a
Fixes: da893c6c97 ("LU-16191 socklnd: limit retries on conns_per_peer mismatch")
HPE-bug-id: LUS-11922
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Change-Id: I6e8abb39ad3c0bcd7fbc8f8c5478c903029df908
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 10 Nov 2023 09:38:19 +0000 (02:38 -0700)]
RM-620 build: New tag 2.14.0-ddn114
New tag 2.14.0-ddn114
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia94862790d1dec3d8080b6d00445ca163afebf81
Patrick Farrell [Wed, 1 Nov 2023 20:14:12 +0000 (16:14 -0400)]
EX-7601 osc: walk chunk unaligned RPC correctly
For decompression, the client must start looking for
compressed chunks at a chunk aligned offset.
Implement this in decompress_request.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3273135990ddf51e8b3c651734e19350e91f659c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52933
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
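As a rough illustration of "chunk aligned offset" in the EX-7601 commit above, the helpers below round a byte offset to a chunk boundary. The function names are hypothetical, the power-of-two chunk size is an assumption, and the commit message does not say whether decompress_request rounds down or up, so both directions are shown.

```python
def _check_chunk_size(chunk_size):
    # Assumption: chunk sizes are positive powers of two.
    if chunk_size <= 0 or chunk_size & (chunk_size - 1):
        raise ValueError("chunk_size must be a positive power of two")

def chunk_align_down(offset, chunk_size):
    """Start of the chunk that contains offset."""
    _check_chunk_size(chunk_size)
    return offset & ~(chunk_size - 1)

def chunk_align_up(offset, chunk_size):
    """First chunk boundary at or after offset."""
    _check_chunk_size(chunk_size)
    return (offset + chunk_size - 1) & ~(chunk_size - 1)

CHUNK = 64 * 1024
print(chunk_align_down(100000, CHUNK))  # 65536
print(chunk_align_up(100000, CHUNK))    # 131072
```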
Patrick Farrell [Fri, 3 Nov 2023 19:56:27 +0000 (15:56 -0400)]
EX-7601 osc: remove unused 'wrkmem'
compress_chunk() takes a wrkmem buffer, which it does not
use.
Remove this and its allocation in compress_request.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I6f236f018f5b79c57cc8725ca0f95125810a4064
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52980
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Fri, 3 Nov 2023 18:11:46 +0000 (14:11 -0400)]
EX-7601 osc: apply compressed flag to dst page
The existing code to apply brw flags to compressed pages
has two issues:
1. The dst_page is NOT an osc async page, it is a bare BRW
page. This means the brw_page2oap macro isn't right,
because there is no oap page.
Because oap_brw_flags is actually oap_brw_page.flag, we
don't ever access the memory pointed at by OAP, just use it
to find an offset back into the brw page.
This means the flags are set correctly, but we still
shouldn't use this macro.
2. However, the function then overwrites these flags by
copying from a page in the source, so OBD_BRW_COMPRESSED is
lost.
Add OBD_BRW_COMPRESSED when we set flags. This ensures the
flag is actually sent to the server on compressed IO.
This was not causing any problems because the server does
not actually use the OBD_BRW_COMPRESSED flag yet.
(EX-7601 uses this flag)
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia94cdc803868ce16a0b66fd58578ec8b2d00cbae
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52979
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Andreas Dilger [Thu, 9 Nov 2023 00:10:05 +0000 (17:10 -0700)]
EX-8270 sptlrpc: don't crash for too-large chunk size
If the chunk size is too large, don't fall off the
end of the page_pool[] array with a large "order".
Test-Parameters: trivial
Fixes: d945f1b064 ("EX-6261 ptlrpc: extend sec bulk functionality")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I192ac1b227f1cab8405f6657e754101d353ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
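The kind of bounds check described in EX-8270 above might look like the sketch below. `pick_pool` and the list of pools are hypothetical stand-ins for the page_pool[] array; how the "order" is derived from the chunk size is not shown in the commit message and is not modelled here.

```python
def pick_pool(page_pools, order):
    """Return the pool for the given order, or None if order is out of range."""
    if order < 0 or order >= len(page_pools):
        # A too-large chunk size is refused rather than indexing past the end.
        return None
    return page_pools[order]

pools = ["pool-0", "pool-1", "pool-2"]
print(pick_pool(pools, 1))  # pool-1
print(pick_pool(pools, 7))  # None
```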
Bobi Jam [Wed, 23 Aug 2023 16:43:56 +0000 (00:43 +0800)]
EX-7806 csdc: not support data compression on MDT
Do not support setting data compression component on DoM until
data compression on MDT is implemented.
Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I3794460140f08a073377c418dd56e7dda907d96d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52062
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 9 Nov 2023 00:29:26 +0000 (17:29 -0700)]
EX-7601 csdc: improve preview warning messages
Avoid printing duplicate warning messages on the console when
creating files with multiple compressed components. On the
flip side, log a console message when compression is enabled
so that this will later be visible if enabled on a system.
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8cb2f67689824513335f3fa65e9ea751923ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Aurelien Degremont [Tue, 17 Oct 2023 13:07:45 +0000 (15:07 +0200)]
LU-17205 utils: add lctl get_param -H option
- Add a new '-H' option to 'lctl get_param' that will prefix
each output line with the parameter name instead of only
the first line by default.
That makes grepping lctl get_param with wildcards much easier
as you can now easily know which parameter returns which value.
$ lctl get_param -H osc.*.state | grep current
osc.lustre-OST0000-osc-ff1148c0.state=current_state: FULL
osc.lustre-OST0001-osc-ff1248c0.state=current_state: DISCONN
osc.lustre-OST0002-osc-ff1348c0.state=current_state: FULL
osc.lustre-OST0003-osc-ff1448c0.state=current_state: FULL
osc.lustre-OST0004-osc-ff1548c0.state=current_state: FULL
It also prints an output line even for empty values. That also
makes life easier for admins.
- The patch also removes the forced line feed if the parameter
value was larger than 80 chars. This was considered a misfeature
and is now dropped for all usages, with or without -H.
Lustre-change: https://review.whamcloud.com/52730
Lustre-commit: a12c352a3dd8d424b1da09efc6884530c60d105b
Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: Ib1fa0dc400db4c19fed10ad4cced9be5668418e3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53067
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
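The per-line prefixing that LU-17205 above describes for '-H' can be modelled with a short sketch. This is not the lctl implementation: `prefix_lines` is a hypothetical helper, and printing `name=` for an empty value is an assumption about the exact output format.

```python
def prefix_lines(name, value):
    """Prefix every line of a parameter value with the parameter name."""
    # An empty value still yields one output line (the "name=" form is assumed).
    lines = value.splitlines() or [""]
    return [f"{name}={line}" for line in lines]

for out in prefix_lines("osc.demo.state", "current_state: FULL\nother: 1"):
    print(out)
```

With every line carrying its parameter name, a plain `grep` over wildcard output keeps the association between value and parameter.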
Andreas Dilger [Mon, 13 Mar 2023 22:08:30 +0000 (16:08 -0600)]
LU-16639 misc: cleanup console messages
The lprocfs_job_cleanup() was not properly dropping all jobstats
from the hash table and printing errors from job_stat_exit() at
unmount. Ensure all stats are "old enough" when @clear is set.
Change early libcfs cfs_cpu_init() messages from CERROR() to
pr_err() to avoid circular dependencies on libcfs setup before
printing an error message to the console during module init.
Lustre-change: https://review.whamcloud.com/50283
Lustre-commit: 8f40a3d7110da1af8e310a4b7f40b86f13080938
Test-Parameters: trivial
Fixes: ea2cd3af7b ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ide3f502103392a79419cc1836200bf5a1a3ebbe5
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53063
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 9 Nov 2023 23:28:45 +0000 (16:28 -0700)]
LU-17251 tests: use stderr in precreated_ost_obj_count()
Write the status output to stderr instead of stdout, so that
it doesn't confuse the caller that is expecting the number
of objects precreated in stdout.
Test-Parameters: trivial testlist=sanity-scrub,sanity-lfsck
Fixes: c39bdce94f ("LU-17251 test: improve parallel-scale rr_alloc test")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib9b132a04a88b15cea34872954bfa5c4ddf8cde7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53062
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 9 Nov 2023 09:38:47 +0000 (02:38 -0700)]
RM-620 build: New tag 2.14.0-ddn113
New tag 2.14.0-ddn113
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I73eab3dc06a0488b7e68c7434cb8a6af2c590a2f
Andreas Dilger [Thu, 9 Nov 2023 09:38:11 +0000 (02:38 -0700)]
RM-620 build: New tag lipe-2.36
New tag lipe-2.36
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6c986a17d42f4bd95009d9d0f03acc601c9ee2dd
Andreas Dilger [Wed, 8 Nov 2023 22:39:28 +0000 (15:39 -0700)]
EX-7601 tests: improve/skip sanity test_460a
Skip sanity test_460a for el9.2 clients, since they appear to be
failing that test regularly, but no other distro client is.
Improve the log messages to see what stage is currently running.
Limit the "cmp --verbose" messages to one chunk, otherwise it
may print the entire 14MB test file (about 80 MiB of ASCII).
Move enable_compression() and disable_compression() functions
into test-framework.sh so that they can be used for all tests.
Set LFS_SETSTRIPE_COMPR_OK=y in enable_compression() since we
already know this is a preview and don't need it printed.
Allow sanity-compr.sh to specify SANITY_ONLY and/or SANITYN_ONLY,
and skip the other test script run if only one of them is set.
Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY=460,HONOR_EXCEPT=y clientdistro=el9.2
Test-Parameters: testlist=sanity-compr env=SANITY_ONLY=460 clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8cb2f67689824513335f3fa65e9ea7519e3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53043
Tested-by: jenkins <devops@whamcloud.com>
Lei Feng [Wed, 8 Nov 2023 08:01:05 +0000 (16:01 +0800)]
EX-8570 lipe: add lpcc sub command to trigger purge scan
Add a sub command 'lpcc purge-scan' to trigger purge
scanning by sending SIGUSR2 to matching lpcc_purge
process.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I976621fe787daa15b8206eed97efdebe75cd7425
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53036
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Wed, 8 Nov 2023 04:46:45 +0000 (12:46 +0800)]
EX-8569 lipe: trigger lpcc_purge scan by SIGUSR2
Send SIGUSR2 to lpcc_purge to trigger a scan
immediately.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I2811c90ac75c93167e8104e90b424ac31c8cc50c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Wed, 8 Nov 2023 02:07:32 +0000 (10:07 +0800)]
EX-8568 lipe: lpcc_purge can disable force scanning
When force_scan_interval is set to -1, lpcc_purge will never
start force scanning.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I21bcadb97f09622eae08af73082196e816b2c9ae
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Mon, 6 Nov 2023 06:40:13 +0000 (14:40 +0800)]
EX-4125 lipe: adjust atime in lpcc_purge
Sometimes atime < mtime. In this case, adjust atime to mtime.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY="200-203"
Change-Id: I35b3da543b57265b09ef65f4e810761aa727f483
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53002
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
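One reading of the EX-4125 adjustment above is that the purge logic should never treat a file as accessed earlier than it was last modified. The sketch below encodes that reading; `effective_atime` is a hypothetical name and the interpretation itself is an assumption, since the commit message gives no further detail.

```python
def effective_atime(atime, mtime):
    """Return the access time to use: mtime when atime is older than mtime."""
    # Assumption: "adjust atime" means using the later of the two timestamps.
    return mtime if atime < mtime else atime

print(effective_atime(atime=1000, mtime=1500))  # 1500
print(effective_atime(atime=2000, mtime=1500))  # 2000
```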
Lei Feng [Tue, 7 Nov 2023 04:06:24 +0000 (12:06 +0800)]
EX-8551 lipe: build arch-specific lipe-lpcc package
lpcc_purge in the lipe-lpcc package is an executable binary,
so an arch-specific lipe-lpcc package is needed.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I0387e258eaec6e39156f823d3a38b5dc3fb9a4cd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53007
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Raphael Druon <rdruon@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Tue, 22 Feb 2022 07:28:50 +0000 (01:28 -0600)]
LU-15576 osp: Interop skip sanity test 823 for MDS < 2.14.0-ddn112
Prior to v2_14_55-29-g06e586016d setting create_count greater
than the maximum returned -ERANGE.
During interop testing skip sanity/823 for MDS older than 2.14.0-ddn112.
Lustre-change: https://review.whamcloud.com/46567
Lustre-commit: 5da859e262dd5e93bfeb2bfa1366a9e20395d3f4
Test-Parameters: trivial serverversion=2.14.0 testlist=sanity env=ONLY=823
Fixes: 06e586016d3a ("LU-13941 osp: Silently lower requested create_count to maximum")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie79617deea047b2a846f696473b9c2b5681953be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53022
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 3 Nov 2023 23:49:29 +0000 (17:49 -0600)]
LU-10465 osd-ldiskfs: 8MiB IOs should bypass cache
Changes the writethrough_max_io_mb and readcache_max_io_mb
params to check for IO size >= max_io_mb instead of > max_io_mb
when deciding to bypass cache.
Read/write IOs that are 8MiB in size should bypass the pagecache
on the OSTs, rather than requiring IOs that are slightly larger
than this. 8MiB is enough to submit 1MiB to each HDD spindle in
an 8+2 RAID6, and caching these writes on the OSS is not helping.
Lustre-change: https://review.whamcloud.com/52989
Lustre-commit: TBD (from dcdc4748f1443981a170bc2945b178226e64a6d4)
Test-Parameters: trivial
Fixes: 3043c6f189 ("LU-12071 osd-ldiskfs: bypass pagecache if requested")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iae435f5b99e2e8bc6a9458fedad65a81c2853350
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
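The comparison change in LU-10465 above can be illustrated with a small sketch. This is not the osd-ldiskfs code: `should_bypass_cache` and its arguments are hypothetical names, and only the `>` versus `>=` comparison against max_io_mb is taken from the commit message.

```python
MIB = 1 << 20

def should_bypass_cache(io_bytes, max_io_mb, inclusive=True):
    """Return True if an IO of io_bytes would skip the page cache.

    inclusive=True models the new ">=" check; inclusive=False models
    the previous ">" check.
    """
    limit = max_io_mb * MIB
    return io_bytes >= limit if inclusive else io_bytes > limit

print(should_bypass_cache(8 * MIB, 8))                   # True
print(should_bypass_cache(8 * MIB, 8, inclusive=False))  # False
```

With the inclusive check, an IO of exactly 8MiB qualifies when the limit is 8, which is the case the commit message calls out.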
Bobi Jam [Thu, 21 Sep 2023 14:24:32 +0000 (22:24 +0800)]
LU-16958 llite: migrate deadlock on not responding lock cancel
An lfs migrate race makes the MDS hang with the following backtrace:
[ 3683.248584] [<0>] ldlm_completion_ast+0x78d/0x8e0 [ptlrpc]
[ 3683.250122] [<0>] ldlm_cli_enqueue_local+0x2fd/0x840 [ptlrpc]
[ 3683.251363] [<0>] mdt_object_local_lock+0x50e/0xb10 [mdt]
[ 3683.252615] [<0>] mdt_object_lock_internal+0x187/0x430 [mdt]
[ 3683.253793] [<0>] mdt_object_lock_try+0x22/0xa0 [mdt]
[ 3683.254857] [<0>] mdt_getattr_name_lock+0x1317/0x1dc0 [mdt]
[ 3683.256016] [<0>] mdt_intent_getattr+0x264/0x440 [mdt]
[ 3683.257105] [<0>] mdt_intent_opc+0x452/0xa80 [mdt]
[ 3683.258126] [<0>] mdt_intent_policy+0x1fd/0x390 [mdt]
[ 3683.259191] [<0>] ldlm_lock_enqueue+0x469/0xa90 [ptlrpc]
[ 3683.260350] [<0>] ldlm_handle_enqueue0+0x61a/0x16c0 [ptlrpc]
[ 3683.261596] [<0>] tgt_enqueue+0xa4/0x200 [ptlrpc]
[ 3683.262662] [<0>] tgt_request_handle+0xc9c/0x1a40 [ptlrpc]
[ 3683.263860] [<0>] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
[ 3683.265220] [<0>] ptlrpc_main+0xbf3/0x1540 [ptlrpc]
[ 3683.266303] [<0>] kthread+0x134/0x150
[ 3683.267111] [<0>] ret_from_fork+0x35/0x40
The deadlock happens as follows:
T1:
vvp_io_init()
->ll_layout_refresh() <= take lli_layout_mutex
->ll_layout_intent()
->ll_take_md_lock() <= take the CR layout lock ref
->ll_layout_conf()
->vvp_prune()
->vvp_inode_ops() <= release lli_layout_mutex
->vvp_inode_ops() <= try to acquire lli_layout_mutex
-> racer wait here for T2
T2:
->ll_file_write_iter()
->vvp_io_init()
->ll_layout_refresh() <= take lli_layout_mutex
->ll_layout_intent() <= Request layout from MDT
-> racer wait from server...
The server wants to cancel the CR layout lock that T1 holds, which
won't happen. Also, T1 could have taken an extent ldlm lock while waiting
for the lli_layout_mutex held by T2, so ofd_destroy_hdl does not get the
lock cancellation response from T1.
lli_layout_mutex is only needed for enqueuing the layout lock from the
server, so ll_layout_conf() need not involve lli_layout_mutex.
Lustre-commit: TBD (from 7de620b53bea8a2fc252ceea4787f1226ce63a02)
Lustre-change: https://review.whamcloud.com/52388
Fixes: 8f2c1592c3 ("LU-16958 llite: migrate vs regular ops deadlock")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib94de2c63544c3a962199aa0537418255980ae8c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52451
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vladimir Saveliev [Wed, 26 Jul 2023 13:09:18 +0000 (16:09 +0300)]
LU-16043 osc: allow error for write on CL_FSYNC_DISCARD
In case of CL_FSYNC_DISCARD, an error is allowed for write of the osc object.
Otherwise, the included test fails in rm with:
(osc_page.c:174:osc_page_delete()) Trying to teardown failed: -16
(osc_page.c:175:osc_page_delete()) ASSERTION( 0 ) failed:
(osc_page.c:175:osc_page_delete()) LBUG
Lustre-change: https://review.whamcloud.com/48032
Lustre-commit: 050c2fb23b1f98745305a3dfe3062ea5a66dfdb4
Test-Parameters: trivial testlist=sanity env=ONLY=907
HPE-bug-id: LUS-10410
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: I0aae0dc470ba0371964e7643a6d84b19a1b4e106
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andrew Perepechko [Tue, 10 Jan 2023 21:53:38 +0000 (16:53 -0500)]
LU-16609 target: top_trans_create cannot alloc memory
top_trans_create() requests __GFP_IO memory allocation,
which does not allow direct reclaim. However, if the
memory shortage is temporary, direct reclaim is reasonable.
GFP_NOFS is __GFP_IO with additional reclaim bits.
Lustre-change: https://review.whamcloud.com/50176
Lustre-commit: 9d1f8f1e3557ee3349c623f4f5596df44f60b082
Change-Id: I2c84d9d74188660063c948573780745a2b59a688
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-11293
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53031
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Wed, 18 Oct 2023 03:54:59 +0000 (22:54 -0500)]
LU-17197 obdclass: preserve fairness when waiting for rpc slot
When obd_get_mod_rpc_slot() waits for an available slot it places the
waiting thread at the HEAD of the queue, so it will be woken before
anything else that is already queued. This is clearly unfair and can
hurt performance.
So change to always add to the tail to ensure a FIFO ordering (except
that CLOSE might sometimes be woken a bit early).
This regression was introduced in a rewrite that was supposed to make
waiting more fair - by avoiding a broadcast wakeup for "close"
requests.
Also fix some stale comments and expose __add_wait_queue_entry_tail.
Running mdtest with the patch applied shows about a 3% improvement:
                     master               patched
mdtest-easy-write    350.585906 kIOPS     353.783545 kIOPS
mdtest-easy-stat     1320.329353 kIOPS    1408.320419 kIOPS
mdtest-easy-delete   285.084103 kIOPS     289.625900 kIOPS
[SCORE]              509.115803 kiops     524.516113 kiops
Lustre-change: https://review.whamcloud.com/52738
Lustre-commit: TBD (from 7e28964085a4d98111b926fe125abc7f815e70be)
Fixes: 5243630b09d2 ("LU-15947 obdclass: improve precision of wakeups for mod_rpcs")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If767c4299bcbab71589b0f3c01e85bf461686ca5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52886
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
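The fairness problem described in LU-17197 above comes down to where a new waiter is placed in the queue. The sketch below is a toy model using a deque, not the kernel wait-queue API; `wake_order` is a hypothetical helper, and the special handling of CLOSE requests mentioned in the commit is deliberately left out.

```python
from collections import deque

def wake_order(arrivals, add_to_tail=True):
    """Return the order in which waiters are woken, always waking from the head."""
    queue = deque()
    for waiter in arrivals:
        if add_to_tail:
            queue.append(waiter)      # patched behaviour: FIFO
        else:
            queue.appendleft(waiter)  # old behaviour: newest waiter jumps the queue
    order = []
    while queue:
        order.append(queue.popleft())
    return order

print(wake_order(["A", "B", "C"]))                     # ['A', 'B', 'C']
print(wake_order(["A", "B", "C"], add_to_tail=False))  # ['C', 'B', 'A']
```

Head insertion makes the longest-waiting thread the last to be woken, which is the unfairness the commit removes.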
Alex Deiter [Wed, 1 Nov 2023 22:54:32 +0000 (02:54 +0400)]
LU-17251 test: improve parallel-scale rr_alloc test
Added checking for pre-created OST objects and waiting
(maximum 60 seconds) before executing the rr_alloc test.
Lustre-change: https://review.whamcloud.com/52940
Lustre-commit: TBD (from 3f1f70264e1ed9ba77094435fc598bc9abbbc044)
Test-Parameters: trivial
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=8
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=8
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=8
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=8
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=8
Test-Parameters: testlist=parallel-scale env=ONLY=rr_alloc,ONLY_REPEAT=8
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: Ib604b99138ceccf384476ad2876d9df7cd7d524b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 3 Nov 2023 00:32:44 +0000 (18:32 -0600)]
LU-17251 osp: force precreate if create_count grows
Force the MDS to precreate OST objects if "osp.*.create_count" is
written and the OSP does not have at least that many precreated
objects locally. This avoids doing complex operations in test
scripts to force precreation to run, which slows down the tests
and increases the chance that a test might fail.
Previously opd_precreate_force was only used for handling OSTs
that were reformatted and this reset "create_count" to minimum, so
move that to the reformat case rather than in the precreate code
path so it does not reset "create_count" when it was just set.
Remove the "env" argument from several precreate-related functions,
since it wasn't used in those functions, and that made it difficult
to call them from the "create_count" parameter handling.
Lustre-change: https://review.whamcloud.com/52968
Lustre-commit: TBD (from 0206ef4d765aca3f298e24dd630f155114781986)
Test-Parameters: testlist=parallel-scale env=ONLY=test_rr_alloc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iac35c1b981fcd6ab2d1ea5abc9ffe2e4563ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52998
Tested-by: jenkins <devops@whamcloud.com>
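The trigger condition described in the LU-17251 osp commit above can be written as a one-line predicate. `needs_forced_precreate` and its arguments are hypothetical names for illustration; the real logic lives in the OSP "create_count" parameter handler, which is not reproduced here.

```python
def needs_forced_precreate(precreated_objects, create_count):
    """True if writing create_count should force the MDS to precreate objects."""
    # Force precreation only when fewer objects than requested exist locally.
    return precreated_objects < create_count

print(needs_forced_precreate(precreated_objects=32, create_count=128))   # True
print(needs_forced_precreate(precreated_objects=256, create_count=128))  # False
```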
Mikhail Pershin [Wed, 1 Nov 2023 14:55:39 +0000 (17:55 +0300)]
LU-17249 ptlrpc: protect scp_rqbd_idle list operations
Take the spinlock when getting an entry from the scp_rqbd_idle list
in ptlrpc_service_purge_all(), as is done in all other places
where the rqbd_list linkage is managed.
Lustre-change: https://review.whamcloud.com/52931
Lustre-commit: 9ba375983d498690f5caa29c289c137470a76505
Test-Parameters: testgroup=full-part-1 env="SLOW=yes,ENABLE_QUOTA=yes"
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Iace37b1ee79bfd0c3a54a35722952e17d860a91c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Tue, 26 Sep 2023 23:57:46 +0000 (16:57 -0700)]
LU-17103 lnet: use workqueue for lnd ping buffer updates
Introduce workqueue for handling lnd-initiated ping buffer
update requests.
This is done to avoid the possibility of monitor thread
lock up waiting for the "old" ping buffer refcount to get
decremented during the update, while the message which
triggers the decrement is on the monitor thread's own queue
waiting to be processed.
Lustre-change: https://review.whamcloud.com/52522/
Lustre-commit: TBD (from 1200e9ce1b8272f4affb20386570a9a6e79ceeb4)
Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY="207 500",ONLY_REPEAT=50
Fixes: 7ac399c5 ("LU-16949 lnet: get monitor thread to update ping buffer")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I5176581703e52f4adbfff417040bebcc2489b79e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52936
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Tue, 17 Oct 2023 18:43:14 +0000 (11:43 -0700)]
LU-17207 lnet: race b/w monitor thr stop and discovery push
As a result of a race, the discovery thread may attempt to dereference
a message on ln_mt_resendqs that was just freed by the stopping monitor
thread. Make sure the discovery thread is stopped first.
Lustre-change: https://review.whamcloud.com/52734/
Lustre-commit: TBD (from 5c6ca4991382a805da6e824c1dbfab931987dda6)
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I0dfcf3bc5bb3c8df195388599f571bdd3caaa3d7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52935
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 7 Nov 2023 16:00:37 +0000 (17:00 +0100)]
EX-8543 tools: remove laudit/laudit-report
laudit/laudit-report is a demonstration tool for what is possible in
terms of Lustre audit. It is not meant to be used in production
because it stores the audit data as plaintext flat files, which is
neither secure nor scalable, and it is largely untested at scale.
So remove laudit/laudit-report from lipe sources, and fix build and
packaging mechanisms accordingly.
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I36fbd50cd4485f2cc7b0ee91922e58f92e008255
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53015
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Li Dongyang [Mon, 6 Nov 2023 04:22:47 +0000 (15:22 +1100)]
LU-17248 kernel: wait for pages under writeback for bdev
Use a better version of the kernel patch instead of
just adding the SB_I_STABLE_WRITES flag to the bdev superblock.
We don't need to wait for page writeback on all block devices,
in particular on those that don't require stable pages.
Test-Parameters: trivial
Fixes: 5968bc3954 ("LU-17248 kernel: add SB_I_STABLE_WRITES to bdev sb flag")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I20cfa33c4ef45b10e6a732e325698c6b1b00bc79
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53001
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 7 Nov 2023 14:56:16 +0000 (17:56 +0300)]
LU-16843 ldiskfs: merge extent blocks
There are cases (e.g. file written synchronously with discontiguous
blocks that are later filled in) when a lot of extents are created
initially, then the extents get merged over time, but there is no
way to merge the index blocks. This can cause a very deep extent
index tree (above 5 levels) and cause problems like:
inode has invalid extent depth: 6
Merge leaf/index blocks (at most one at each level) to the right/left
when extents are removed from the index.
Submitted to the linux-ext4 mailing list:
https://lore.kernel.org/linux-ext4/7A2B8861-96AA-4815-BB58-180F63F62436@whamcloud.com/
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I09dfab193d82e7c99620ddb95aff2015023f73aa
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Mon, 30 Oct 2023 10:16:49 +0000 (13:16 +0300)]
EX-8369 ldiskfs: fix dense writes
Don't mix "dense" and regular writes, as regular writes are bound
to logical offsets.
Fixes: f36eda6a1e ("LU-10026 osd-ldiskfs: use preallocation for dense writes")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9f6b2c600f2132dcad23726f2fb3848ab02cc117
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52888
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Aurelien Degremont [Tue, 7 Nov 2023 19:38:53 +0000 (11:38 -0800)]
LU-17254 lnet: Fix ofed detection with specific kernel version
Improve the OFED configure step with LNET when the kernel version
contains special characters that could be interpreted in regexp
mode.
It is not uncommon in the Debian world to have '+' in the kernel version.
Lustre-change: https://review.whamcloud.com/52949
Lustre-commit: b83156304df2d418aadb5d3dfd5f570ef72a7e2e
Test-parameters: trivial
Change-Id: Ia3da59c74d8c2e59e16525dd50c7b83c2b5fada8
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53021
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
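One way to keep a kernel version such as one containing '+' from being read as a pattern is to escape regular-expression metacharacters before matching. The sketch below illustrates that idea only; escape_regex is a hypothetical helper, and the actual LU-17254 change may use a different approach (for example fixed-string matching in the configure script).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, prefixing each regular-expression metacharacter with a
 * backslash. Returns 0 on success, -1 if dst is too small.
 */
static int escape_regex(const char *src, char *dst, size_t dst_len)
{
	static const char meta[] = ".+*?[](){}^$|\\";
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	for (; *src != '\0'; src++) {
		size_t need = strchr(meta, *src) != NULL ? 2 : 1;

		/* Always leave room for the terminating NUL. */
		if (out + need >= dst_len)
			return -1;
		if (need == 2)
			dst[out++] = '\\';
		dst[out++] = *src;
	}
	if (dst_len == 0)
		return -1;
	dst[out] = '\0';
	return 0;
}
```

The version string used in the test is made up for illustration.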
Jian Yu [Tue, 7 Nov 2023 19:47:01 +0000 (11:47 -0800)]
LU-17257 build: use pkg-config to find krb5 libdir
This patch fixes kerberos5.m4 to use pkg-config to
find krb5 libdir instead of looking for the krb5
libraries in a static list of paths.
Lustre-change: https://review.whamcloud.com/53010
Lustre-commit: TBD (from 9cccb643173acf536f542103d47e4af7057c0ff9)
Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Change-Id: Ia15812932942171b019f3e73034a78f9185c16ce
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53024
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Wed, 8 Nov 2023 20:22:54 +0000 (12:22 -0800)]
LU-16518 utils: fix clang build errors
This patch fixes a number of small clang build
errors in Lustre utils. Many errors are related
to nuances in typing or statements which appear
to be tautologies. These are resolved.
Some unneeded parentheses are removed. A variable
is initialized which could potentially be left
uninitialized. And a comparison was added that
seemed to be left out.
Lustre-change: https://review.whamcloud.com/50161
Lustre-commit: 632dc6729abcaf83aeaef8167a73ce18b9a41a67
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Id3f40b033e640f8d2ae6386f66a88de06fc89666
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53042
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Aurelien Degremont [Tue, 7 Nov 2023 19:43:48 +0000 (11:43 -0800)]
LU-17256 debian: allow building client dkms on arm64
Just add 'arm64' to the supported architecture list
for the 'lustre-client-modules-dkms' debian package.
Lustre-change: https://review.whamcloud.com/52951
Lustre-commit: c4c9a8eea31cf9aa02f75ca3f119f90d67c70cc5
Test-Parameters: trivial
Change-Id: I2af307ee87448faeec81f6e0e27573ae980710f1
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53023
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sun, 5 Nov 2023 10:52:56 +0000 (03:52 -0700)]
RM-620 build: New tag 2.14.0-ddn112
New tag 2.14.0-ddn112
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibd5e877813d29da337ac1343dcdd3223ef2e7355
Andreas Dilger [Sun, 5 Nov 2023 10:52:20 +0000 (03:52 -0700)]
RM-620 build: New tag lipe-2.35
New tag lipe-2.35
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I85314a2e67b809e0ebe40d3428db6ab19a5c554a
Jian Yu [Thu, 2 Nov 2023 18:54:42 +0000 (11:54 -0700)]
LU-16667 build: fix extra errors related to struct mnt_idmap
This patch fixes the following extra build errors related to
struct mnt_idmap:
lustre/llite/pcc.c:2440:40: error: passing argument 1 of
'inode_owner_or_capable' from incompatible pointer type
[-Werror=incompatible-pointer-types]
2440 | inode_owner_or_capable(&init_user_ns, inode)) ||
| ^~~~~~~~~~~~~
| |
| struct user_namespace *
include/linux/fs.h:1624:47: note: expected 'struct mnt_idmap *'
but argument is of type 'struct user_namespace *'
1624 | bool inode_owner_or_capable(struct mnt_idmap *idmap,
| ~~~~~~~~~~~~~~~~~~^~~~~
lustre/llite/pcc.c:3656:40: error: passing argument 1 of
'inode->i_op->fileattr_set' from incompatible pointer type
[-Werror=incompatible-pointer-types]
3656 | rc = inode->i_op->fileattr_set(&init_user_ns, dentry, &fa);
| ^~~~~~~~~~~~~
| |
| struct user_namespace *
lustre/llite/pcc.c:3656:40: note: expected 'struct mnt_idmap *'
but argument is of type 'struct user_namespace *'
Change-Id: Ia310f6f9053228b38b41243912dfe7818cfef33a
Test-Parameters: trivial
Fixes: 3011aa5 ("LU-16667 build: struct mnt_idmap, linux/filelock.h")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Thu, 2 Nov 2023 18:50:25 +0000 (11:50 -0700)]
LU-16802 build: compatibility for 6.4 kernels
linux kernel v6.3-rc4-32-g6eb203e1a868
iov_iter: remove iov_iter_iovec()
Provide a replacement iov_iter_iovec() when one is not provided.
linux kernel v6.3-rc4-34-g747b1f65d39a
iov_iter: overlay struct iovec and ubuf/len
This renames iov_iter member iov to __iov and provides the
iov_iter() accessor.
Define __iov as iov when __iov not present.
Provide an iov_iter() for older kernels.
linux kernel v6.3-rc1-13-g1aaba11da9aa
driver core: class: remove module * from class_create()
Provide an ll_class_create() to pass THIS_MODULE, or not,
as needed by class_create().
Linux commit v6.2-rc1-20-gf861646a6562
quota: port to mnt_idmap
Update osd_dquot_transfer to use mnt_idmap and fall back
to user_ns, if needed, by dquot_transfer.
Linux commit v6.3-rc7-2433-gcf64b9bce950
SUNRPC: return proper error from get_expiry()
Updated get_expiry() requires a time64_t pointer to be passed
to hold the expiry time. A non-zero return value indicates an
error, nominally -EINVAL. Provide a wrapper for kernels that
return a time64_t and return -EINVAL on error.
Lustre-change: https://review.whamcloud.com/50875
Lustre-commit: TBD (from 1bd4e67d1f635e0a5f94280c4bab85668ce677ca)
Test-Parameters: trivial
HPE-bug-id: LUS-11614
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I765d6257eec8b5a9bf1bd5947f03370eb9df1625
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52954
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
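The get_expiry() wrapper described in LU-16802 above can be sketched in userspace C as follows. This is an illustration under assumptions: old_get_expiry, compat_get_expiry and sketch_time64_t are hypothetical names, and the old-style helper is assumed to signal failure by returning 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t sketch_time64_t;

/* Old-style helper (assumed): returns the expiry time, or 0 if parsing fails. */
static sketch_time64_t old_get_expiry(const char *text)
{
	char *end;
	long long value = strtoll(text, &end, 10);

	if (end == text || *end != '\0' || value <= 0)
		return 0;
	return (sketch_time64_t)value;
}

/* New-style wrapper: returns 0 with *expiry set, or -EINVAL on error. */
static int compat_get_expiry(const char *text, sketch_time64_t *expiry)
{
	assert(text != NULL && expiry != NULL);
	*expiry = old_get_expiry(text);
	return *expiry == 0 ? -EINVAL : 0;
}
```

Callers can then use the pointer-based form on every kernel and check only the integer return value.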
Serguei Smirnov [Fri, 11 Aug 2023 00:58:11 +0000 (17:58 -0700)]
LU-17006 lnet: set up routes for going across subnets
Modify ksocklnd-config to set up a route through the subnet's
default gateway when a default gateway is defined,
for example:
ip route add default via <gw_for_eth0> dev eth0 table eth0
which results in a route similar to the following added to
the eth0 route table:
default via <gw_for_eth0> dev eth0
If there's no gateway found for the eth0 subnet, keep the old
behaviour which results in the following added to eth0
route table:
<eth0_subnet> dev eth0 proto kernel scope link src <eth0_ip>
This makes sure that MR traffic goes out the intended interface
as selected by LNet no matter whether going across subnets or not.
Lustre-change: https://review.whamcloud.com/51921
Lustre-commit: 7f60b2b5580f67ca55e53a78dbaf7d50b5b7ab47
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I84a299c8b7eb4cdb4fc24408a1e42ad0283d9219
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52190
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Mon, 25 Sep 2023 19:03:20 +0000 (14:03 -0500)]
LU-17103 lnet: Avoid deadlock when updating ping buffer
lnet_peer_send_push() adds a reference to the the_lnet.ln_ping_target
lnet_ping_buffer. This reference is dropped by
lnet_discovery_event_handler. When the LNet configuration is modified
the ln_api_mutex is held and lnet_ping_target_update() is called to
update the ln_ping_target to reflect the new configuration.
While holding the ln_api_mutex, lnet_ping_target_update() will wait
until all refs on the old ping buffer are dropped. This can result
in a deadlock if the ln_api_mutex is required to complete the push.
Here is one scenario where this can happen:
1. PUSH is sent by discovery thread
2. LNet configuration is modified. lnetctl process is holding
ln_api_mutex and waiting in lnet_ping_target_update()
3. Local NI goes into recovery
4. Monitor thread wakes and attempts to send ping to local NI. If this
is the first ping sent to this NI then monitor thread needs
ln_api_mutex to create peer NI object for local NI.
(LNetGet->
lnet_send->
lnet_select_pathway->
lnet_peerni_by_nid_locked->
mutex_lock(&the_lnet.ln_api_mutex))
5. PUSH (1) fails with local timeout. It is placed on monitor thread
resend queue.
6. monitor thread cannot process resend queue until it acquires
ln_api_mutex. ln_api_mutex cannot be acquired until monitor thread
processes resend queue. Deadlock.
Fix is to drop ln_api_mutex before calling lnet_ping_md_unlink() in
lnet_ping_target_update(). This should ensure that updates to the
ping target are still synchronized via ln_api_mutex as intended, but
we're able to clear refs on the old ping buffer as needed.
Lustre-change: https://review.whamcloud.com/52479/
Lustre-commit: 3ca6ba39a21cfebc81bbe7f889c486bb82bb563a
Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=207,ONLY_REPEAT=50
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I20cda185a865192f1ad162eaef1b8b4e5d751b2c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52934
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
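The deadlock in the LU-17103 scenario above, and the effect of dropping the mutex before waiting, can be shown with a single-threaded toy model. This is a sketch, not the LNet code: every name below is hypothetical, the mutex is modelled as a boolean, and the real fix would re-acquire ln_api_mutex after the wait, which the model omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ping_buffer_sketch {
	int refs;                           /* outstanding references */
};

struct lnet_sketch {
	bool api_mutex_held;                /* stand-in for ln_api_mutex */
	struct ping_buffer_sketch *target;  /* stand-in for ln_ping_target */
};

/*
 * Stand-in for the monitor thread: it can only drop a reference on the old
 * buffer if it is able to take the mutex itself.
 */
static bool monitor_drop_ref(struct lnet_sketch *net,
			     struct ping_buffer_sketch *old)
{
	if (net->api_mutex_held)
		return false;               /* would block forever */
	old->refs--;
	return true;
}

/*
 * Swap in a new ping buffer, then wait for references on the old one.
 * Returns false if the wait could never complete (the deadlock).
 */
static bool ping_target_update(struct lnet_sketch *net,
			       struct ping_buffer_sketch *new_buf,
			       bool drop_mutex_before_wait)
{
	struct ping_buffer_sketch *old;
	bool ok = true;

	assert(net != NULL && new_buf != NULL);
	net->api_mutex_held = true;         /* models mutex_lock() */
	old = net->target;
	net->target = new_buf;

	if (drop_mutex_before_wait)
		net->api_mutex_held = false;    /* models unlock before waiting */

	while (ok && old != NULL && old->refs > 0)
		ok = monitor_drop_ref(net, old);

	net->api_mutex_held = false;        /* models the caller's final unlock */
	return ok;
}
```

The pointer swap still happens under the lock in both variants; only the wait for old references moves outside it.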
Li Dongyang [Wed, 1 Nov 2023 11:36:10 +0000 (22:36 +1100)]
LU-17248 kernel: add SB_I_STABLE_WRITES to bdev sb flag
Since RHEL 8.6, wait_for_stable_page() is controlled by
a new flag, SB_I_STABLE_WRITES, on the super block.
However, the new flag is not set on the bdev pseudo sb,
which means that when writing directly to the block device
we do not wait on page writeback. This can trigger
false block integrity errors: a page can be modified
again while under writeback, so the integrity checksum no
longer matches the new data.
Lustre-change: https://review.whamcloud.com/52922
Lustre-commit: TBD (from 5aeffdbec699abad07ed2326723c7743faadbf8a)
Change-Id: Ie088abf29f40b294c31f993bcfad56d6081a3fce
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52969
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Mon, 30 Oct 2023 19:13:45 +0000 (12:13 -0700)]
LU-17235 o2iblnd: adding alias ib interface causes crash
Commit 09c6e2b872 (LU-16836) causes the o2iblnd startup routine to crash
when an alias ib interface is used:
ifconfig ib0:0 10.1.0.52 up
modprobe lnet
lnetctl lnet configure
lnetctl net add --net o2ib --if ib0:0
Fix the code that sets the NI status on startup so that it gracefully
handles the case where the corresponding net_device is not found.
Lustre-change: https://review.whamcloud.com/52894/
Lustre-commit: TBD (from 26a00e20fad0cd7871c30fe65653415566b498dc)
Test-Parameters: trivial testlist=sanity-lnet
Fixes: 09c6e2b872 ("LU-16836 lnet: ensure dev notification")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iaef9280a10f27ac28b872d9f4bc119c4d459ef40
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52910
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Fri, 9 Jun 2023 03:04:37 +0000 (11:04 +0800)]
EX-7584 ptlrpc: define nrs_orr_object.oo_ref atomic_t
nrs_orr_object.oo_ref is a reference counter but is not an atomic type.
nrs_trr_hash_ops.hs_put() is filled with nrs_orr_hop_put(), which
decreases oo_ref without any protection. So change it to atomic_t
to eliminate any potential race condition.
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanityn env=ONLY=77d,ONLY_REPEAT=100
Change-Id: I69d27eebdddab79d7dd7e279391cd841e438b5d3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52948
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
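A minimal userspace sketch of the idea behind EX-7584 above: a reference counter that is updated atomically, so a put racing with another put cannot lose a decrement. It uses C11 <stdatomic.h> as a stand-in for the kernel's atomic_t; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object carrying an atomic reference count. */
struct orr_object_sketch {
	atomic_int ref;
};

static void orr_object_get(struct orr_object_sketch *obj)
{
	atomic_fetch_add(&obj->ref, 1);
}

/* Returns true when the caller dropped the last reference. */
static bool orr_object_put(struct orr_object_sketch *obj)
{
	/* atomic_fetch_sub returns the value held before the decrement. */
	int old = atomic_fetch_sub(&obj->ref, 1);

	assert(old > 0);
	return old == 1;
}
```

Because the decrement and the read of the previous value are one atomic operation, exactly one caller observes the transition to zero.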
Shaun Tancheff [Mon, 23 Aug 2021 14:40:39 +0000 (09:40 -0500)]
LU-13941 osp: Silently lower requested create_count to maximum
When setting create_count it should silently accept a larger value
and truncate it to the current maximum.
This would avoid issues if that limit is changed in the future.
Lustre-change: https://review.whamcloud.com/39967
Lustre-commit: 06e586016d3acc490f922e43e3aee6b8112a2803
HPE-bug-id: LUS-5960
Test-Parameters: trivial testlist=parallel-scale,sanity
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4727ba6fca747e1ae9850188ef63c7abb89904be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52967
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
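The LU-13941 behaviour above amounts to clamping the requested value instead of rejecting it. A minimal sketch, assuming a hypothetical helper name and a caller-supplied maximum rather than any constant from the Lustre source:

```c
#include <assert.h>

/* Accept any requested create_count, silently lowering it to max_count. */
static unsigned int accept_create_count(unsigned int requested,
					unsigned int max_count)
{
	assert(max_count > 0);
	return requested > max_count ? max_count : requested;
}
```

Since the limit is a parameter here, a script that writes an oversized value keeps working if the maximum changes later.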