Whamcloud - gitweb
fs/lustre-release.git
17 months ago  RM-620 build: New tag 2.14.0-ddn128
Andreas Dilger [Sat, 6 Jan 2024 08:22:54 +0000 (01:22 -0700)]
RM-620 build: New tag 2.14.0-ddn128

New tag 2.14.0-ddn128

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie84bfc97c732043030769c183a7e8a879bb3e0f1

17 months ago  LU-17289 test: fix sanity/906 version check
Andreas Dilger [Thu, 4 Jan 2024 00:07:34 +0000 (00:07 +0000)]
LU-17289 test: fix sanity/906 version check

Fix the version check in test_906 to include RHEL9.3.0.

Change-Id: I7e066cdd16946b541fee96281dd5a5c90daa7072
Fixes: a6739c9c9a ("LU-17289 test: disable sanity/test_906 temporarily")
Test-Parameters: trivial testlist=sanity clientdistro=el9.3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-17352 utils: lljobstat can read dumped stats files
Lei Feng [Sun, 10 Dec 2023 08:45:38 +0000 (16:45 +0800)]
LU-17352 utils: lljobstat can read dumped stats files

Improve the lljobstat command to read dumped stats files.
Such a file is usually generated by the command:
  lctl get_param *.*.job_stats > all_job_stats.txt

Multiple files can be specified with multiple --statsfile
options. For example:
  lljobstat --statsfile=1.txt --statsfile=2.txt

Stats data from multiple files will be added up and
sorted. Then the top jobs will be listed.

Try to use CLoader to accelerate the YAML parsing.

Handle SIGINT and exit silently if lljobstat is in the loop
of reading system job_stats files periodically.

Fix a bug when the job_id is a pure number.
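
The aggregation described above can be sketched as follows. This is a
minimal illustration, not the lljobstat implementation: the record shape
and the names merge_job_stats and top_jobs are hypothetical, and the input
is assumed to be already parsed so that no YAML library is needed. It also
shows one plausible way a purely numeric job_id can cause trouble (a parser
may return it as an integer), which is an assumption rather than a
description of the actual bug.

```python
from collections import Counter


def merge_job_stats(parsed_files):
    """Add up per-job operation counts across several parsed stats dumps.

    parsed_files: iterable of lists of {"job_id": ..., "ops": int} records
    (a hypothetical, simplified shape; real job_stats entries carry more).
    """
    totals = Counter()
    for records in parsed_files:
        for rec in records:
            # Normalise the key to str so that a purely numeric job_id
            # (possibly parsed as an int) merges with its string form.
            totals[str(rec["job_id"])] += rec["ops"]
    return totals


def top_jobs(totals, count=5):
    """Return the `count` busiest jobs as (job_id, ops) pairs, busiest first."""
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:count]


file1 = [{"job_id": 1234, "ops": 10}, {"job_id": "cp.0", "ops": 3}]
file2 = [{"job_id": "1234", "ops": 5}, {"job_id": "dd.500", "ops": 7}]
print(top_jobs(merge_job_stats([file1, file2]), count=2))
```

Sorting on (-ops, job_id) keeps the ordering stable when two jobs have the
same count.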

Lustre-change: https://review.whamcloud.com/53397
Lustre-commit: ef2555d7af21bd35756805b13e6b458f56cecf54

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: Iee1ce69d2befb9d021e34effd4fc65a47297c1fb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53582
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  LU-17048 mdd: protect layout change in MDD layer
Bobi Jam [Mon, 28 Aug 2023 13:08:34 +0000 (21:08 +0800)]
LU-17048 mdd: protect layout change in MDD layer

We need to detect changes to the LOD layout in between transaction
declaration and when the objects are locked during transaction
execution. Otherwise, if another thread has modified the layout
of an object used by the transaction then the declaration may
be incorrect.

This patch saves each object's layout generation in the transaction
declaration phase and checks whether it has been changed by another
thread in the transaction execution phase. If it has, the transaction
is retried several times.
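
The save, check and retry pattern described above can be sketched in
Python. This is only an illustration of the idea under stated assumptions:
Obj, run_transaction, the before_execute hook and the retry limit of 3 are
all hypothetical and do not correspond to MDD code.

```python
class LayoutChanged(Exception):
    """The layout generation kept changing between declaration and execution."""


class Obj:
    """Hypothetical stand-in for an object with a layout generation counter."""

    def __init__(self):
        self.layout_gen = 0


def run_transaction(objs, execute, max_retries=3, before_execute=None):
    """Run execute(objs), retrying if any layout generation changes in between."""
    for attempt in range(1, max_retries + 1):
        # Declaration phase: remember each object's layout generation.
        declared = [o.layout_gen for o in objs]
        if before_execute is not None:
            before_execute(attempt)  # test hook that simulates a racing thread
        # Execution phase: the objects would be locked here, so re-check them.
        if [o.layout_gen for o in objs] != declared:
            continue  # the declaration is stale, so start over
        return execute(objs), attempt
    raise LayoutChanged("gave up after %d attempts" % max_retries)


obj = Obj()


def racer(attempt):
    # Another thread changes the layout during the first attempt only.
    if attempt == 1:
        obj.layout_gen += 1


print(run_transaction([obj], lambda objs: "done", before_execute=racer))
```

In this run the first attempt sees a stale declaration and the second one
succeeds, so the call returns on attempt 2.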

Lustre-change: https://review.whamcloud.com/52146
Lustre-commit: d5ab62af24166529b84b4d7227b96d3a69989a95

Fixes: b7bd4e3422 ("LU-14621 mdd: fix lock-tx order in mdd_xattr_merge()")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I25fe03c6e8fc4eebccc039e62dfc88db1179cb26
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  EX-7601 osc: debug fix in decompress_request
Patrick Farrell [Thu, 4 Jan 2024 18:56:29 +0000 (13:56 -0500)]
EX-7601 osc: debug fix in decompress_request

Debug message had an incorrect subtraction.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5daf360766ca77b98dc5af3d72c42ac38f5782bc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 tests: add mmap write test
Patrick Farrell [Fri, 29 Dec 2023 20:10:58 +0000 (15:10 -0500)]
EX-7601 tests: add mmap write test

This improves the existing mmap test to test mmap writing
as well as mmap reading.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1003",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81840c7bbbefbc5c3bae6b270c2d94297a254d19
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53307
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  EX-7601 tests: add multi-mount compression test
Patrick Farrell [Fri, 29 Dec 2023 20:10:36 +0000 (15:10 -0500)]
EX-7601 tests: add multi-mount compression test

This adds a multi-mount correctness test for compression.
This races IO from two mountpoints at varying sizes to
stress test compression.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY=1006
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If49cbd6d171068faa802835146f273d835b39bc3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51842
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  EX-7601 tests: tests for read-modify-write
Patrick Farrell [Fri, 29 Dec 2023 20:09:49 +0000 (15:09 -0500)]
EX-7601 tests: tests for read-modify-write

This patch adds tests for the read-modify-write case for
EX-7601.  There are still some additional tests to be added
here, but this is a good start.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1004 1005",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5dd9e566b8274ece99283c8962e0d34225089cc0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53230
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 osc: add check to decompress_request
Patrick Farrell [Tue, 19 Dec 2023 04:19:44 +0000 (23:19 -0500)]
EX-7601 osc: add check to decompress_request

decompress_request should check whether there's room in
the RPC for the decompressed data.  Running out of room can
happen if there's a bug or data corruption, and without the
check we would go past the end of the RPC during decompression.
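
A bounds check of this kind can be sketched in Python. The sketch is an
assumption-laden illustration: zlib stands in for whichever compression
algorithm is in use, a bytearray stands in for the RPC pages, and
decompress_into is a hypothetical name.

```python
import zlib


def decompress_into(buf, offset, compressed):
    """Decompress `compressed` into `buf` at `offset`, refusing to overrun."""
    room = len(buf) - offset
    if room < 0:
        raise ValueError("offset is past the end of the buffer")
    d = zlib.decompressobj()
    # Ask for at most room + 1 bytes: receiving more than `room` proves the
    # data does not fit, without ever writing outside the buffer.
    data = d.decompress(compressed, room + 1)
    if len(data) > room:
        raise ValueError("decompressed data does not fit in the buffer")
    buf[offset:offset + len(data)] = data
    return len(data)


rpc = bytearray(16)
print(decompress_into(rpc, 4, zlib.compress(b"lustre")))
```

Oversized input is rejected before anything is copied, so the buffer keeps
its length and contents.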

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib1bf19bf39701b72f0f5a61b2aaff2f2fdad1897
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-7601 ofd: add checks to io_lnb_to_tx_lnb
Patrick Farrell [Wed, 13 Dec 2023 23:33:16 +0000 (18:33 -0500)]
EX-7601 ofd: add checks to io_lnb_to_tx_lnb

We should always be able to find the remote niobuf in the
local IO range.  If we can't, there's a bug, so assert on
this.

We should also never have page level overlapping remote IOs,
at least until we have unaligned DIO.  (We can remove this
check when we combine the features.)

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I325d4a37d25c116e42621964e90b225b71fd8f1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53450
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-7601 ofd: add past eof check for reads
Patrick Farrell [Tue, 12 Dec 2023 17:35:25 +0000 (12:35 -0500)]
EX-7601 ofd: add past eof check for reads

The client does not normally generate reads past EOF, but
this can occur during some racing situations.  We need to
check for that case and not attempt decompression, since
there's no data to decompress if we're reading past EOF.

This covers a failure which shows up occasionally in the
racing parts of the test suite, but it's challenging to
write an explicit test for this.

We also add handling for complete reads of the last chunk,
even if that chunk is partial, because we can send that to
the client for decompression.

This allows us to remove the slightly funky eof handling
in decompress_rnb, since we'll just not call that code in
this case now.  Note we'll still call decompress_rnb, etc,
for writes if they start before EOF and finish after EOF
(and are unaligned).  This is fine - this case should be
rare and if we hit it, we'll notice there's nothing to
decompress and proceed accordingly.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I50295f2803af611de5069d094c0a5d1b0a4a9c2d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-7601 ofd: put decompress_read in read prep
Patrick Farrell [Tue, 12 Dec 2023 17:29:22 +0000 (12:29 -0500)]
EX-7601 ofd: put decompress_read in read prep

ofd_decompress_read is called from ofd_write_prep for
writes, but from tgt_brw_read for reads.  This makes the
code a little harder to follow and makes it difficult to
check read side decompression against EOF.

Instead, we move the decompression call to ofd_preprw_read.
This makes no change to the real operations here, but makes
for better code (and more similar code between read and
write).

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibefd0a48ad08e83725f2df64618db60ba61c5ce0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53427
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-7601 ofd: same-ify preprw_read and preprw_write
Patrick Farrell [Tue, 12 Dec 2023 17:07:35 +0000 (12:07 -0500)]
EX-7601 ofd: same-ify preprw_read and preprw_write

preprw_read and preprw_write have some sections which are
functionally the same but which have diverged slightly.
(These can't easily be shared between the functions.)

This is a short patch to make them more similar before
adding eof checking to reads.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7bce912e99e61a4eec4060d6b49d4917894b44c4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53426
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-7601 ofd: don't read for writes past eof
Patrick Farrell [Tue, 12 Dec 2023 16:37:48 +0000 (11:37 -0500)]
EX-7601 ofd: don't read for writes past eof

There's no data past EOF, so there's no need to do
read-modify-writes when the entire write is past the chunk
at EOF.  So in that case, don't read up data and don't
attempt decompression.
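
The condition above can be sketched as a small predicate. This is a
simplified reading of the message, not the ofd code: the names are
hypothetical, and "past the chunk at EOF" is assumed to mean that the write
starts at or beyond the end of the chunk that contains EOF.

```python
def round_up(value, align):
    """Round value up to the next multiple of align."""
    return -(-value // align) * align


def write_is_past_eof_chunk(write_start, file_size, chunk_size):
    """True if the write lies entirely beyond the chunk that holds EOF."""
    # The chunk holding EOF ends at round_up(file_size, chunk_size); a write
    # starting at or beyond that point touches no existing data, so there is
    # nothing to read up or decompress.
    return write_start >= round_up(file_size, chunk_size)


CHUNK = 64 * 1024
print(write_is_past_eof_chunk(131072, 100000, CHUNK))
print(write_is_past_eof_chunk(120000, 100000, CHUNK))
```

With a 64 KiB chunk and a 100000-byte file, the EOF chunk ends at 131072,
so only the first call reports a write that needs no read.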

There's no explicit test for this, but this shows up
immediately in the random-offset copy tests, because they
seek and write various sizes to offsets past current EOF.

We also need this functionality for reads, because in some
cases the client will do reads past EOF (this is unusual,
but can still happen sometimes).  This is added in a
separate patch because it requires some code reorganization.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia2b598165d5645c5a44c3d58bea69c7e42f10e41
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-7601 ofd: multiple reads in same chunk
Patrick Farrell [Tue, 12 Dec 2023 04:35:24 +0000 (23:35 -0500)]
EX-7601 ofd: multiple reads in same chunk

When doing DIO or if we get unusual cache behavior on the
client, multiple reads can hit the same chunk.

This only shows up in racing tests, but it's important to
handle.  We do this by making sure we start searching the
lnbs for decompression at the start of the last chunk we
decompressed.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81fbbba79b16066e6d4519c66030cc58e03d2de7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53419
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  RM-620 build: New tag 2.14.0-ddn127
Andreas Dilger [Tue, 2 Jan 2024 08:26:11 +0000 (01:26 -0700)]
RM-620 build: New tag 2.14.0-ddn127

New tag 2.14.0-ddn127

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I38dd7ae99d4594896d14224de68d6b42e83fde10

17 months ago  EX-7601 tgt: objcount in RPC must be 1
Patrick Farrell [Fri, 29 Dec 2023 19:49:55 +0000 (14:49 -0500)]
EX-7601 tgt: objcount in RPC must be 1

Much of the BRW write code assumes objcount is one, but
there is some provision for multiple objects.

Since the code will break if we send it multiple objects,
add errors to make sure anyone changing it will notice.

This isn't strictly compression related, but compression
adds even more code which assumes this, so this protection
will be useful.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idcbf33fd14d4b1bd179c9516bed07cca907008bc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  EX-8826 ofd: set compressed file size for fake writes
Patrick Farrell [Sat, 16 Dec 2023 22:40:16 +0000 (17:40 -0500)]
EX-8826 ofd: set compressed file size for fake writes

When using the fake writes fail_loc, file size setting is
done at the ofd layer, since the osd layer isn't used.  So
we must also handle the compressed file size for this case.

This fixes sanity test 399a with compression.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icda612405908166d043e1e568d0d8bd9cd0c5156
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 ofd: minor debug improvements
Patrick Farrell [Mon, 11 Dec 2023 23:15:44 +0000 (18:15 -0500)]
EX-7601 ofd: minor debug improvements

A smattering of minor debug improvements across several
patches, placed at the end because they're all minor and
some of them would disturb early parts of the series.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2071911eb09f5c7fad28203db05396bb31ccda59
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 osd: add assert for prepare partial page
Patrick Farrell [Mon, 20 Nov 2023 00:40:27 +0000 (19:40 -0500)]
EX-7601 osd: add assert for prepare partial page

In the write prep code, we read up any partial pages (pages
which are not completely overwritten by the write) to
prepare them for write.  But for compressed files, we will
have already done this to prepare for decompression.

Add an assert to make sure we catch if this is ever wrong.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1366b1f5b191a4d581448d692933d562198c3a1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 ofd: create read mapping for read-modify-write
Patrick Farrell [Sun, 5 Nov 2023 15:55:29 +0000 (10:55 -0500)]
EX-7601 ofd: create read mapping for read-modify-write

When we need to do a read-modify-write for unaligned writes
to a compressed file, it's important we read only the
portion of the file which is receiving unaligned IO.

This patch identifies these chunks in preprw_write and
creates a read lnb mapping from a subset of the pages for
write.  These pages we read up are then decompressed.
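
Identifying the chunks that receive unaligned IO can be sketched as
follows. This is a simplified model with a hypothetical function name: it
works on byte offsets rather than lnb pages and ignores EOF, which the
message notes is handled separately.

```python
def partial_chunks(offset, length, chunk_size):
    """Return the start offsets of chunks only partly covered by a write.

    Only these chunks need to be read and decompressed before the write;
    chunks that are fully overwritten can be skipped.
    """
    end = offset + length
    starts = []
    first = offset - offset % chunk_size
    if offset != first:
        starts.append(first)   # the write begins in the middle of a chunk
    last = end - end % chunk_size
    if end != last and last not in starts:
        starts.append(last)    # the write ends in the middle of a chunk
    return starts


print(partial_chunks(32, 128, 64))
```

For a write of 128 bytes at offset 32 with 64-byte chunks, the chunks at 0
and 128 are partial, while the chunk at 64 is fully overwritten and is not
read.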

Note one issue this patch does not address is reading of
data past EOF.  If the final chunk is unaligned, we will
round the write to cover it.  This results in extending the
file inappropriately, writing zeroes where they aren't
needed.  The read side gives us the info to address this,
which we will do in a future patch.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iede43f12127cbb93e73c22a915192aa2f814a927
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52997
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 ofd: distinguish nr_write and nr_read
Patrick Farrell [Fri, 3 Nov 2023 20:29:51 +0000 (16:29 -0400)]
EX-7601 ofd: distinguish nr_write and nr_read

We will have two counts of pages in lnbs, so distinguish
between them.

Not actually used yet - will be calculated when the read
lnb mapping is created.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I709b8fd299163d348a196184152bb0294fcb650b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  EX-7601 ofd: add read lnb to ofd_preprw_write
Patrick Farrell [Fri, 3 Nov 2023 20:22:22 +0000 (16:22 -0400)]
EX-7601 ofd: add read lnb to ofd_preprw_write

The read phase of read-modify-write for compressed files
needs to read only a subset of the pages which will be
written, so it needs a separate set of lnb pointers for
tracking this subset.

This patch passes around the necessary argument but does
not set up or use the lnb yet.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7ec7101e65e73a6c9e67cea3c58d8cace38e70e0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52984
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months ago  LU-17370 utils: simplify lfs help text
Alexandre Ioffe [Thu, 21 Dec 2023 06:53:42 +0000 (22:53 -0800)]
LU-17370 utils: simplify lfs help text

Simplify help text for lfs getstripe and lfs setstripe.
Update corresponding man pages lfs-getstripe and lfs-setstripe.
In the man pages, left-adjust the text and disable hyphenation
('.ad l' and '.nh') to prevent hyphenation of keywords.

Lustre-change: https://review.whamcloud.com/53564
Lustre-commit: TBD (from 6c3dae58eddc2e3c7caf35599733b2e59ebeb657)

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial
Change-Id: Iae9d3534230ee7d325fbeffd78b5c12632a4a161
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53523
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-17186 utils: replace gethostby*() with get*info()
Sebastien Buisson [Fri, 8 Dec 2023 19:53:12 +0000 (11:53 -0800)]
LU-17186 utils: replace gethostby*() with get*info()

This patch replaces the deprecated gethostbyname() and
gethostbyaddr() functions with getaddrinfo() and getnameinfo()
functions respectively.

The getaddrinfo() function combines the functionality provided by the
gethostbyname() and getservbyname() functions into a single interface,
but unlike the latter functions, getaddrinfo() is reentrant and allows
programs to eliminate IPv4-versus-IPv6 dependencies.

The getnameinfo() function is the inverse of getaddrinfo(): it
converts a socket address to a corresponding host and service, in a
protocol-independent manner. It combines the functionality of
gethostbyaddr() and getservbyport(), but unlike those functions,
getnameinfo() is reentrant and allows programs to eliminate
IPv4-versus-IPv6 dependencies.
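
The round trip between the two functions can be illustrated with Python's
socket module, which wraps the same C interfaces. This is a sketch rather
than the patched utility code; the address 127.0.0.1 and port 988 are
example values only, and the numeric flags are used so that no DNS lookup
is needed.

```python
import socket

# Forward direction: host and port to a list of socket addresses.
infos = socket.getaddrinfo("127.0.0.1", 988, type=socket.SOCK_STREAM,
                           flags=socket.AI_NUMERICHOST)
family, socktype, proto, canonname, sockaddr = infos[0]

# Reverse direction: socket address back to (host, service) strings.
host, service = socket.getnameinfo(
    sockaddr, socket.NI_NUMERICHOST | socket.NI_NUMERICSERV)

print(family.name, sockaddr, host, service)
```

The same two calls work unchanged for an IPv6 address, which is the
protocol independence the message refers to.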

Lustre-change: https://review.whamcloud.com/52632
Lustre-commit: TBD (from 99687573d33336a153c1a5b94a4b66ebbcc2d0f1)

Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iacb5583826cd2f7329455bc6cbb4477f9087f15a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53386
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-17261 lov: ignore broken components
Alex Zhuravlev [Sun, 5 Nov 2023 13:51:29 +0000 (16:51 +0300)]
LU-17261 lov: ignore broken components

If some component of a mirrored file is broken, it makes sense
to try another (possibly valid) replica rather than give up
immediately.

Lustre-change: https://review.whamcloud.com/52996
Lustre-commit: 902fe290e51dccdee89380fb725ae6e3c1802e2b

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I32ea0efa90109f5159bf8b6c4e0efe1d543580c3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  RM-620 build: New tag 2.14.0-ddn126
Andreas Dilger [Fri, 29 Dec 2023 11:20:01 +0000 (04:20 -0700)]
RM-620 build: New tag 2.14.0-ddn126

New tag 2.14.0-ddn126

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: I67599c5b0918be8761e0561cc9c60e39f171196f

17 months ago  LU-16859 lnet: incorrect check for duplicate NI
Serguei Smirnov [Tue, 31 Oct 2023 21:11:54 +0000 (14:11 -0700)]
LU-16859 lnet: incorrect check for duplicate NI

When an NI is being added to an existing LNet, the check against
existing NI interface names currently fails if the new NI
happens to use an interface name which is a prefix of one used
by an existing NI.

The following example assumes ib0 and its alias ib0:1 are
configured:

lnetctl net add --net o2ib --if ib0:1
lnetctl net add --net o2ib --if ib0

Fix this by making sure interface strings are compared properly
regardless of relative length.
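
The class of bug described above can be illustrated in Python. The exact
LNet comparison code is not shown in this message, so both functions are
hypothetical: the first assumes a comparison limited to the length of the
new name, and the second compares whole strings.

```python
def is_duplicate_buggy(new_if, existing_ifs):
    """Length-limited compare: only the first len(new_if) characters count."""
    # "ib0" wrongly matches "ib0:1" because the compare stops after 3 chars.
    return any(old[:len(new_if)] == new_if for old in existing_ifs)


def is_duplicate_fixed(new_if, existing_ifs):
    """Whole-string compare: names match only if they are identical."""
    return new_if in existing_ifs


existing = ["ib0:1"]
print(is_duplicate_buggy("ib0", existing), is_duplicate_fixed("ib0", existing))
```

With ib0:1 already configured, the length-limited version rejects ib0 as a
duplicate, while the whole-string version accepts it.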

Lustre-change: https://review.whamcloud.com/52918
Lustre-commit: 7dcdb9eb0ded98e956fe417abbd835433a8de3f0

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I0d4047118e7d9982fa791a2e324a27aa5d4abaee
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  LU-16552 test: add new lnet test for Multi-Rail setups
James Simmons [Sun, 13 Aug 2023 15:02:33 +0000 (11:02 -0400)]
LU-16552 test: add new lnet test for Multi-Rail setups

You can crash the lnet kernel module by setting up an interface with
lctl net up and then attempting to set up the interface with
the import function. This is due to improper clearing of the net_cpts
array.

Currently sanity-lnet.sh doesn't really test MR setups. Because of
this a few bugs slipped in. Add two new tests to ensure MR setups
behave properly. Test 107 checks that deleting a second interface
in an MR setup doesn't crash a node. Test 108 creates a Multi-Rail
setup of a tcp LNet net with two interfaces, one real and the
other fake. A bug was preventing the second fake interface from
being added.

Lustre-change: https://review.whamcloud.com/50302
Lustre-commit: 8785f25b053c69b4303e901c6c8dc5d0d4d6dfc1

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ic69e14bd0617f4d6fe931140b5b6d43b795843cf
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53529
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  LU-12031 mdt: explicit data version of DoM files
Mikhail Pershin [Mon, 25 Apr 2022 06:13:53 +0000 (09:13 +0300)]
LU-12031 mdt: explicit data version of DoM files

Use an EA to store 'data_version' for DoM files explicitly.

Unlike for OST objects, the 'inode_version' of a DoM file is changed
by metadata operations as well, and that leads to problems
during HSM operations. For example, writing an HSM EA with the file
data version inside causes a DoM object version update, making the
version in this HSM EA obsolete. Also, any metadata update on a
restored file makes it dirty and prevents a second release.

DoM files now have an explicitly updated 'data_version' in
addition to the ordinary 'inode_version'. The 'data_version'
is updated along with 'inode_version' upon write, truncate and
fallocate operations and is stored as the 'trusted.dataver' EA.
The layout swap procedure is updated to move the data version between
the files being swapped along with the HSM attributes.
If a DoM file is migrated to a RAID0 file then the 'dataver' EA is
deleted.
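
The difference between the two versions can be shown with a toy model.
This is an illustration only: DomFile and its methods are hypothetical,
and plain counters are an assumption standing in for the real version
values.

```python
class DomFile:
    """Toy model of a DoM file with two independent version counters."""

    def __init__(self):
        self.inode_version = 0   # bumped by every change
        self.data_version = 0    # bumped by data changes only

    def _change(self, touches_data):
        self.inode_version += 1
        if touches_data:
            self.data_version += 1

    def write(self):
        self._change(touches_data=True)

    def truncate(self):
        self._change(touches_data=True)

    def set_xattr(self):
        # Metadata-only update, such as storing an HSM EA.
        self._change(touches_data=False)


f = DomFile()
f.write()
recorded = f.data_version      # value an HSM EA would record
f.set_xattr()                  # storing that EA changes metadata only
print(f.inode_version, f.data_version == recorded)
```

Because the metadata update leaves data_version untouched, the value
recorded before it is still current afterwards, which is the property the
HSM operations need.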

Corresponding test 1f is added to sanity-hsm.sh and
207j to sanity.sh.

Lustre-change: https://review.whamcloud.com/47139
Lustre-commit: aae3289adb2bbc192870f195b78044484f717e16

Test-Parameters: clientversion=2.12.4 testlist=sanity-hsm
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I4689c56394c7323d32cd6f7dd86f58beb6e53353
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53214
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-12998 mds: add no_create parameter to stop creates
Andreas Dilger [Sat, 23 Apr 2022 00:10:36 +0000 (18:10 -0600)]
LU-12998 mds: add no_create parameter to stop creates

Add a target tunable parameter and mount option "no_create" to
disable new *directory* creation on an MDT.  This sends the
flag OS_STATFS_NOCREATE to the clients, and the DNE MDT space
balance will avoid selecting that MDT when creating a new
subdirectory, without disabling access to existing files/dirs.
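
The selection behaviour described above can be sketched as follows. This
is a deliberately simplified assumption: the real DNE space balancing is
more involved than picking the MDT with the most free space, NOCREATE is a
hypothetical bit standing in for OS_STATFS_NOCREATE, and pick_mdt is a
hypothetical name.

```python
NOCREATE = 0x1  # hypothetical bit standing in for OS_STATFS_NOCREATE


def pick_mdt(mdts):
    """Return the index of the MDT with the most free space that allows creates."""
    candidates = [m for m in mdts if not (m["state"] & NOCREATE)]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m["free"])["index"]


mdts = [
    {"index": 0, "free": 900, "state": NOCREATE},  # being drained for an upgrade
    {"index": 1, "free": 400, "state": 0},
    {"index": 2, "free": 650, "state": 0},
]
print(pick_mdt(mdts))
```

MDT 0 has the most free space but is skipped because of the flag, so the
next best candidate is chosen. Existing files on MDT 0 are not affected by
this choice.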

This allows "soft disabling" an MDT in advance of storage
upgrades to minimize new directories and files created on that
MDT, reduce future migration, and/or backup/restore workload.

As yet it does not totally disable *file* creation on the MDT,
but it may be extended to do so in the future.

This is analogous to the "no_precreate" option that was added
on the OSTs, and "no_create" has been added to the OSTs for
consistency ("no_precreate" is kept for compatibility for now).

lod_declare_create() checks whether the target MDT of a directory
create is the current MDT; a mismatch may happen if nocreate is set
on some MDT. Upon such a mismatch, call dt_statfs() to fetch the
latest statfs and learn whether nocreate is set.

lmv_create() will choose another MDT if the target MDT is set with
nocreate, but in case the flag has been cleared, call obd_statfs() to
fetch the cached statfs and check again.

Lustre-change: https://review.whamcloud.com/47124
Lustre-commit: 1dbcd0bab881fac38d8a5e4ef1559f12618f8f0e
Lustre-change: https://review.whamcloud.com/53437
Lustre-commit: 066262a04cb8e0cbf49a20b7bf036d4484399afe (TBD)

Test-Parameters: testlist=conf-sanity env=ONLY=112b,ONLY_REPEAT=50
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I53cfb48ade2f844b18bfc630e7fcea6de9ce7057
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53189
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-17263 utils: 'lfs find -blocks' to use 512-byte units
Andreas Dilger [Sun, 5 Nov 2023 05:32:19 +0000 (23:32 -0600)]
LU-17263 utils: 'lfs find -blocks' to use 512-byte units

Change the default units for 'lfs find -blocks' from 1KiB blocks
to 512-byte blocks to better match the behavior of find(1).  This
also matches what "-printf %b" will print.

Change llapi_parse_size() to accept a 'c' argument to specify
characters, and accept a "B" or "iB" suffix if provided.
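
The unit handling can be sketched with a small parser. This is not
llapi_parse_size(): parse_size is a hypothetical function, the accepted
grammar is an assumption, and all unit letters are treated as binary
multiples here.

```python
import re

UNITS = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30, "t": 1 << 40}
SIZE_RE = re.compile(r"(\d+)(c|[kKmMgGtT](?:i?B)?)?")


def parse_size(text, default_unit=512):
    """Convert a size string to bytes; bare numbers count 512-byte blocks."""
    m = SIZE_RE.fullmatch(text)
    if m is None:
        raise ValueError("bad size: %r" % text)
    number, unit = int(m.group(1)), m.group(2)
    if unit is None:
        return number * default_unit      # bare number: 512-byte blocks
    if unit == "c":
        return number                     # 'c': characters, i.e. bytes
    return number * UNITS[unit[0].lower()]


for text in ("8", "100c", "4k", "1MiB"):
    print(text, parse_size(text))
```

A bare "8" therefore means 8 blocks of 512 bytes (4096 bytes), which
matches how find(1) counts blocks by default.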

Lustre-change: https://review.whamcloud.com/52993
Lustre-commit: 869ea3211d2f15d7c674bc10e5f1a3272e44504e

Fixes: c043f46025 ("LU-10705 utils: add "lfs find --blocks"")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8345f15bf53912501cadc0fa7f981a9f787b767
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53522
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-17349 tests: sanity-quota_81 decrease timeout
Sergey Cheremencev [Sun, 3 Dec 2023 04:06:23 +0000 (07:06 +0300)]
LU-17349 tests: sanity-quota_81 decrease timeout

Decrease cfs fail timeout in sanity-quota_81 from 30
to 10 seconds to avoid soft lockup.

Lustre-change: https://review.whamcloud.com/53384
Lustre-commit: b58219ef1edebcb266cbe0dfede491ba5de491d1

Fixes: 862f0baa7c21 ("LU-15097 quota: stop pool_recalc before killing pool")
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8630db7b3948b335fef5d5349f960f79cb877fc3
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53516
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago  LU-17358 lprocfs: make job_stats job_id valid yaml
Nathaniel Clark [Tue, 12 Dec 2023 18:05:22 +0000 (13:05 -0500)]
LU-17358 lprocfs: make job_stats job_id valid yaml

Fix quoting job_id to account for leading '@' being reserved.
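
The quoting rule can be sketched as follows. YAML reserves '@' and '`' as
indicators that may not start a plain (unquoted) scalar, so such a value
must be quoted. The function name yaml_job_id is hypothetical, and the
sketch covers only this one rule; a complete emitter has to quote in other
cases as well.

```python
RESERVED_START = "@`"  # YAML reserves these at the start of a plain scalar


def yaml_job_id(job_id):
    """Return job_id as a YAML scalar, double-quoting it when it must be quoted."""
    if job_id and job_id[0] in RESERVED_START:
        escaped = job_id.replace("\\", "\\\\").replace('"', '\\"')
        return '"%s"' % escaped
    return job_id


print("job_id: " + yaml_job_id("@node17"))
print("job_id: " + yaml_job_id("cp.500"))
```

Only the first value is quoted, so output for ordinary job names stays
unchanged.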

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Ifce3edc9b636db2f059ab9960488972a152d2e7a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53424
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53519

17 months ago  EX-8780 test: wait osts up after restart
Hongchao Zhang [Sat, 9 Dec 2023 20:43:49 +0000 (04:43 +0800)]
EX-8780 test: wait osts up after restart

In test_18e of sanity-lfsck, the OSTs might not be ready on all MDTs,
so the LFSCK status will be incorrect because the LFSCK notification
cannot be sent to all OSTs.

Change-Id: If1ed5d920d5c8b99d42f59f92a1e245a9e2a8267
Test-Parameters: trivial testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53531
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago  LU-17289 test: disable sanity/test_906 temporarily
Qian Yingjin [Thu, 7 Dec 2023 09:45:01 +0000 (04:45 -0500)]
LU-17289 test: disable sanity/test_906 temporarily

On RHEL 9.3, the fio io_uring engine test fails with the error
"Operation not permitted" on both local file systems (ext4 and
XFS) and Lustre:

    "fio: pid=4551, err=1/file:engines/io_uring.c:1047,
    func=io_queue_init, error=Operation not permitted"

This is a generic failure in RHEL9.3.  Thus we disable
sanity/test_906 temporarily until the bug is fixed in RHEL9.3.

Lustre-change: https://review.whamcloud.com/53362
Lustre-commit: TBD (from 0eef4b0818e7a1a42a54333fa713ef660c7e9404)

Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I3805b475c5f3d0b62dc6c57c4cd93f2bc1b67b76
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53546
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17000 gss: Fix Out-of-bounds access under svcgssd_proc.c
Arshad Hussain [Wed, 1 Nov 2023 06:50:53 +0000 (12:20 +0530)]
LU-17000 gss: Fix Out-of-bounds access under svcgssd_proc.c

The problem reported by Coverity was passing a 32-bit type and
then dereferencing it as a larger 64-bit type in function
handle_channel_request(). This patch addresses this issue.

Since this is a uapi, and to catch corner cases like
kernel modules being updated separately from user tools,
RSI_DOWNCALL_MAGIC is also changed from 0x6d6dd62a to
0x6d6dd63a.

This patch also changes the 32-bit member (sid_hash) of
'struct rsi_downcall_data' to 64-bit, which also requires
changing wiretest.c and wirecheck.c.
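
The role of the magic change can be sketched as follows. This is a simplified illustration only: the two magic values come from the message above, but the two-field layout and the `parse_downcall` helper are assumptions, and the real 'struct rsi_downcall_data' has more members.

```python
import struct

OLD_MAGIC = 0x6d6dd62a  # value before this patch
NEW_MAGIC = 0x6d6dd63a  # value after this patch

# Hypothetical, simplified layout: 32-bit magic followed by a 64-bit
# sid_hash, little-endian, no padding.
DOWNCALL = struct.Struct("<IQ")


def parse_downcall(buf: bytes) -> int:
    """Return sid_hash, rejecting buffers built for the old layout."""
    magic, sid_hash = DOWNCALL.unpack_from(buf)
    if magic != NEW_MAGIC:
        # Mismatched kernel module and user tools end up here instead of
        # having a 32-bit field misread as a 64-bit one.
        raise ValueError(f"unexpected downcall magic {magic:#x}")
    return sid_hash
```

A buffer packed with `OLD_MAGIC` is rejected up front, which is the corner case the new magic value is meant to catch.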

Lustre-change: https://review.whamcloud.com/52920
Lustre-commit: 7d764f1f11be144ad26e33aa8cecedc5bb708793

CoverityID: 404758 ("Out-of-bounds access")
Fixes: 4daf43ac3c ("LU-17015 gss: support large kerberos token for rpc sec init")
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8041cd4063f1b1cefdebf5681df426be61820f99
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-14918 osd: don't declare similar ldiskfs writes twice
Alex Zhuravlev [Tue, 7 Dec 2021 08:13:54 +0000 (11:13 +0300)]
LU-14918 osd: don't declare similar ldiskfs writes twice

In some cases (like overstriping) the same operations can be
declared multiple times (new llog records), which leads to a
huge number of credits and performance degradation. We can
avoid this by checking for duplicate declarations.
As every declaration would need an allocation, limit the scope
of these checks to transactions likely to be large.

% of "large" transactions in sanity-benchmark, depending on threshold:

  creates < 5 && writes < 5:
  0.58% (mds1) and 2.97% (mds2)

  creates < 7 && writes < 7:
  0.58% and 2.4%

  creates < 9 && writes < 9:
  0.6% and 1.85%

  creates < 10 && writes < 10:
  0.0004% and 0.000001%

Thus 10 creates or writes is selected as a threshold to enable this
logic.
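
The thresholding idea can be sketched roughly as below. The `Transaction` class, its fields, and the exact point at which tracking starts are assumptions made for illustration; the real logic lives in the ldiskfs OSD declaration code.

```python
class Transaction:
    """Toy model: credits are reserved per declared write."""

    def __init__(self, threshold: int = 10):
        self.threshold = threshold  # declarations before dedup kicks in
        self.ndecl = 0              # declarations seen so far
        self.credits = 0            # total credits reserved
        self.seen = set()           # declarations tracked once "large"

    def declare_write(self, obj, offset, length, credits):
        """Reserve credits, skipping duplicates in large transactions."""
        self.ndecl += 1
        if self.ndecl > self.threshold:
            # Only large transactions pay for tracking declarations.
            key = (obj, offset, length)
            if key in self.seen:
                return 0  # duplicate: no extra credits
            self.seen.add(key)
        self.credits += credits
        return credits
```

In this sketch, small transactions never touch the `seen` set, so they avoid the per-declaration allocation cost mentioned above.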

Lustre-change: https://review.whamcloud.com/45765
Lustre-commit: 9e6225b2e7385cbb7be0474df01075fafc4966d5

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7c893fe3b95646b4b813b999bc832659dfcf03ad
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53485
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoEX-7601 ofd: add decompress_read to ofd_preprw_write
Patrick Farrell [Fri, 3 Nov 2023 18:20:37 +0000 (14:20 -0400)]
EX-7601 ofd: add decompress_read to ofd_preprw_write

We have read up the compressed data from disk, now we must
decompress it so we can rewrite it successfully.

This code still works on the whole lnbs rather than just on
the portion of it which is unaligned.  This is temporary
and will be resolved by a future patch.

With this patch, we have basic read-modify-write support,
so we can re-enable testing.  The next patch adds tests
for read-modify-write.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib6503c15e9fb3d425a7bc295bcc61b41c089a1f0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52983
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: add read to write process
Patrick Farrell [Thu, 2 Nov 2023 22:03:15 +0000 (18:03 -0400)]
EX-7601 ofd: add read to write process

This adds a very simple read to the write process, which
just reads up the entire chunk-rounded write range.

This is a first step - the read will eventually be modified
to only read the unaligned portions which must be
decompressed for read-modify-write.  We will create a
special lnb mapping which contains only the pages which must
be read for decompression (similar to the tx lnb mapping).

For now, this read allows us to test decompression without
handling the mapping.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I169ddc2e161094aebdad1a60ec62e9c1d75cd6d8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: add chunk rounding to write
Patrick Farrell [Thu, 2 Nov 2023 21:40:38 +0000 (17:40 -0400)]
EX-7601 ofd: add chunk rounding to write

For compressed files, we need to round all niobufs to
chunk size in the write process, so we have buffers for
reading in and rewriting the complete chunks.

dt_bufs_get sets up the local niobuf for the write, so we
round before calling it.
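
The rounding itself is simple arithmetic. A minimal sketch, assuming a power-of-two chunk size; the function name `round_to_chunks` is hypothetical and not part of the OFD code:

```python
def round_to_chunks(offset: int, length: int, chunk_size: int):
    """Expand [offset, offset + length) to whole chunks.

    Returns (start, end) with both values chunk-aligned.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if chunk_size <= 0 or chunk_size & (chunk_size - 1):
        raise ValueError("chunk_size must be a power of two")
    mask = chunk_size - 1
    start = offset & ~mask                  # round down
    end = (offset + length + mask) & ~mask  # round up
    return start, end
```

For a 64 KiB chunk, a 1000-byte write at offset 5000 expands to the range 0 to 65536, so the buffers cover the complete chunk that must be read in and rewritten.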

Note this breaks writing to compressed files, which is not
fixed until a few patches later.  For this reason, we
disable the compression tests.  They will be reenabled
shortly - similar to how we handled the read series.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I413aaba9866dd7d6c4463fa620eadf1423379ba1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 vvp: remove unaligned write restriction
Patrick Farrell [Fri, 3 Nov 2023 16:29:58 +0000 (12:29 -0400)]
EX-7601 vvp: remove unaligned write restriction

This series will resolve the unaligned write issue for
compressed files, so we need to remove the restriction on
unaligned writes in order to test it.

This does not mean unaligned writes are working yet, but we
need to make this change so the subsequent patches can be
tested.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4abcbdcd18b00718099483c8dfdb9a7aa41c3ce7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52981
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: switch preprw to chunk_bits
Patrick Farrell [Fri, 3 Nov 2023 16:33:20 +0000 (12:33 -0400)]
EX-7601 ofd: switch preprw to chunk_bits

The compression/decompression code requires chunk_bits
rather than chunk size.  Since we need to call this code
from ofd_preprw_write, we need chunk_bits there.

This modifies the functions so chunk_bits is available
there.
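
For reference, the relationship between the two representations, assuming chunk sizes are powers of two. The helper names here are hypothetical:

```python
def chunk_bits_from_size(chunk_size: int) -> int:
    """Return log2(chunk_size) for a power-of-two chunk size."""
    if chunk_size <= 0 or chunk_size & (chunk_size - 1):
        raise ValueError("chunk_size must be a power of two")
    return chunk_size.bit_length() - 1


def chunk_index(offset: int, chunk_bits: int) -> int:
    """Index of the chunk containing offset, using a shift."""
    return offset >> chunk_bits
```

With chunk_bits, the size is recovered as `1 << chunk_bits`, and chunk lookups become shifts rather than divisions.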

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id98e6d6364eeaaa7753a8aba059387e3e659d2a2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 tgt: add tx lnb for writes
Patrick Farrell [Thu, 2 Nov 2023 21:55:01 +0000 (17:55 -0400)]
EX-7601 tgt: add tx lnb for writes

With compression, the lnbs used for the disk IO on the
server can contain more data than the client requested,
due to reading up whole chunks for decompression.

This means the client is only going to write data into
a subset of the lnbs used for io to storage.

We handle this the same way we do for reads:
We create a second set of lnbs just for the transfer, and
point these lnbs at the pages which will actually receive
data from the client.
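
A toy model of the two sets of buffers, with hypothetical names (`Page`, `build_tx_pages`) and a fixed 4 KiB page size as assumptions: the transfer list holds references to the same page objects as the disk IO list, restricted to the range the client actually sends.

```python
from dataclasses import dataclass


@dataclass
class Page:
    offset: int  # file offset of this page; payload omitted


def build_tx_pages(io_pages, client_offset, client_length, page_size=4096):
    """Return the pages (same objects, not copies) overlapping the
    client's byte range."""
    client_end = client_offset + client_length
    return [
        p for p in io_pages
        if p.offset < client_end and p.offset + page_size > client_offset
    ]
```

For a 4 KiB client write at offset 8 KiB inside a 64 KiB chunk, the disk IO list has 16 pages while the transfer list has one, and that one entry is the third page of the disk IO list.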

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5b668547537698309792daf309842866be79f0b6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52965
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 tgt: add remote_pages for writes
Patrick Farrell [Thu, 2 Nov 2023 21:49:00 +0000 (17:49 -0400)]
EX-7601 tgt: add remote_pages for writes

When we round a write to get all of the compressed chunks,
the number of local and the number of remote pages will
differ.  We need to make sure we do the checksum and data
transfer using the number of remote pages, not the number of
local pages.

This patch calculates the number of remote pages and uses it
accordingly.
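
The difference between the two counts can be shown with a small calculation. This is a sketch assuming 4 KiB pages and a 64 KiB chunk; `page_count` is a hypothetical helper:

```python
def page_count(offset: int, length: int, page_size: int = 4096) -> int:
    """Number of pages touched by [offset, offset + length)."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return last - first + 1


# Client writes 4 KiB at offset 8 KiB; the server rounds to one 64 KiB chunk.
remote_pages = page_count(8192, 4096)  # what the client sends
local_pages = page_count(0, 65536)     # what the server reads and writes
```

Checksums and the bulk transfer would use `remote_pages` here, since only that many pages carry client data.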

Note that just like on the read side, this patch doesn't do
anything until we're actually rounding the chunks for IO in
a later patch.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I38256070d68246613ce67b0bfe328f6443a95533
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52964
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 ofd: round range locking
Patrick Farrell [Mon, 20 Nov 2023 00:18:37 +0000 (19:18 -0500)]
EX-7601 ofd: round range locking

The range locking in OFD needs to be rounded for
compressed chunks.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I530d7f655a1c09033b1a3668c009072874ab1d18
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53178
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 tgt: round write lock to chunk
Patrick Farrell [Thu, 2 Nov 2023 21:31:41 +0000 (17:31 -0400)]
EX-7601 tgt: round write lock to chunk

For unaligned writes, we need to round the write locking to
cover any leading or trailing chunks.  We do this by
creating a local 'remote niobuf' to describe the rounded
range and doing the locking against that niobuf.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2bdea620386ad229375647a0e2cc6180c9bd7aa6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 tgt: identify writes to round
Patrick Farrell [Thu, 2 Nov 2023 21:26:58 +0000 (17:26 -0400)]
EX-7601 tgt: identify writes to round

If the beginning or end of a client write is unaligned, we
must round the locking.  This patch identifies writes where
this is required; the next patch will do the locking.
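
The check amounts to testing both edges of the write against the chunk size. A minimal sketch with a hypothetical function name:

```python
def unaligned_edges(offset: int, length: int, chunk_size: int):
    """Return (leading, trailing): True where the write's start or end
    does not fall on a chunk boundary."""
    leading = offset % chunk_size != 0
    trailing = (offset + length) % chunk_size != 0
    return leading, trailing
```

A write needs rounded locking when either flag is set; a write that starts and ends on chunk boundaries needs none.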

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iec140c24423a0da478f6d42ff6fc620d7ad3ba4a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoEX-7601 ofd: clear pages in decompression
Patrick Farrell [Thu, 30 Nov 2023 19:59:07 +0000 (14:59 -0500)]
EX-7601 ofd: clear pages in decompression

Handling writes to compressed files requires a
read-modify-write cycle, which has implications for how we
handle reads.

Consider the case of a file with an 8 KiB write at offset 0,
which is compressed to 4 KiB.  Then there is another 4 KiB
write at offset 16 KiB.

Updating this correctly requires reading the first chunk,
then decompressing it.  However, this read will go past
EOF, because the write has not occurred yet.  The OSD read
code does not fill in these pages, because read past EOF is
not returned to the client (client gets a short read and
does not actually use the pages).

In our case, however, we must use these pages (from 8 KiB
to 16 KiB).  In the naive version without recompression, we
simply write out 0 - 16 KiB, so we must have zeroes in
those pages, and once we have recompression, we must
compress those pages so we need zeroes in that case too.

So we note if a page has data in it after decompression,
then if it does not, we clear the page.  Note we do NOT set
lnb_rc to 0 when we clear a page, because lnb_rc = 0 is
used to indicate EOF rather than a gap in the file.
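
The clearing step can be modelled as below. The `Lnb` class is a toy stand-in for the real local niobuf, and the `has_data` flag and `rc` values are assumptions for illustration only.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096


@dataclass
class Lnb:
    data: bytearray
    rc: int                 # stands in for lnb_rc; semantics simplified
    has_data: bool = False  # set when decompression filled this page


def clear_unfilled(lnbs):
    """Zero every page that decompression did not fill."""
    for lnb in lnbs:
        if not lnb.has_data:
            lnb.data[:] = bytes(len(lnb.data))
            # rc is deliberately left untouched: 0 would signal EOF.
```

In the example above, with 4 KiB pages, the two pages covering 0 to 8 KiB keep their decompressed contents and the two pages covering 8 KiB to 16 KiB are zeroed, so the rewritten range contains zeroes rather than stale memory.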

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If1d1360185eb087e821167a08e49c9427e29ffc4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 obd: do not decompress empty lnbs
Patrick Farrell [Tue, 12 Dec 2023 20:54:37 +0000 (15:54 -0500)]
EX-7601 obd: do not decompress empty lnbs

For reads which cross EOF, we may get lnbs with no data in
them (similarly for writes which cross EOF).

For these cases, it's important to only copy from the lnbs
where there is data, and only do decompression on the lnbs
if there's actually data in them.

Modify merge chunk to do this.
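
A simplified model of that behaviour, not the actual merge chunk code: each lnb is represented as a hypothetical `(buffer, length)` pair with a non-negative length, and zlib stands in for whichever compressor the file really uses.

```python
import zlib


def merge_chunk(lnbs):
    """Join the valid bytes of each (buffer, length) pair, skipping
    pairs that hold no data."""
    merged = bytearray()
    for buf, length in lnbs:
        if length == 0:
            continue  # nothing was read into this lnb (e.g. past EOF)
        merged += buf[:length]
    return bytes(merged)


def decompress_chunk(lnbs, decompress=zlib.decompress):
    """Decompress the merged data only if there is any."""
    merged = merge_chunk(lnbs)
    if not merged:
        return b""  # all lnbs were empty: nothing to decompress
    return decompress(merged)
```

A chunk whose lnbs are all empty never reaches the decompressor in this sketch, which avoids feeding it uninitialized pages.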

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I83fefcfa6d1396dcd97fad994334bf29438bb4bf
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53430
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7601 obd: add error check in merge_chunk
Patrick Farrell [Wed, 29 Nov 2023 01:45:43 +0000 (20:45 -0500)]
EX-7601 obd: add error check in merge_chunk

If the lnbs we're trying to merge have an error recorded in
them, then they're not going to be valid input for
decompression, so return an error.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1bf17131cb65106087eb5e72e2700db30c0cc975
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53274
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoEX-7644 mmap: add mmap support for compression
Patrick Farrell [Wed, 25 Oct 2023 16:15:59 +0000 (12:15 -0400)]
EX-7644 mmap: add mmap support for compression

This removes the EOPNOTSUPP for compression with mmap and
adds an mmap sanity test for compression.  This patch
removes all the restrictions for mmap, but we actually only
have unaligned read support right now, so the test is
deliberately simplified to only test reads.

A more complicated version which also tests mmap writes
comes later in the series, once read-modify-write is
supported.

The test exercises mmap by copying data at several different
block sizes with several different compression chunk sizes.

Test-Parameters: testlist=sanity-compr env=ONLY="1003",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4a37b106831a903d90e8a8871e9a93baac4e201e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52280
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
17 months agoLU-17317 gss: no cache flush for rsi and rsc
Sebastien Buisson [Tue, 5 Dec 2023 16:02:21 +0000 (17:02 +0100)]
LU-17317 gss: no cache flush for rsi and rsc

RPCSEC init and RPCSEC context caches hold gss-related information
of security contexts established between network peers. These cache
entries are tightly coupled with contexts handled in the sptlrpc layer
so they must not be purged directly. They are inserted into the cache
when sptlrpc security contexts are established, and removed when the
corresponding security contexts are destroyed.

Lustre-change: https://review.whamcloud.com/53377
Lustre-commit: 3615fa4a86be793652d53c94818c5aeb81e2257e

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Fixes: 4daf43ac3c ("LU-17015 gss: support large kerberos token for rpc sec init")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I903f75a4b5229286fcaed3e9d96b5eee7f653f15
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17015 gss: remove legacy sunrpc-cache based gss caches
Sebastien Buisson [Thu, 14 Sep 2023 12:23:07 +0000 (14:23 +0200)]
LU-17015 gss: remove legacy sunrpc-cache based gss caches

Now that GSS caches are based on Lustre's internal upcall cache
mechanism, we can remove the legacy ones based on the sunrpc cache
implementation, as this code is unused.

We can also remove support for updated get_expiry() in Linux 6.3, as
this function is no longer used.

Lustre-change: https://review.whamcloud.com/52376
Lustre-commit: 8665ba238412f407963724413e137b89d5cd384f

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I98d8777d225c723ae061ef360011abfc092e09d8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Xing Huang <hxing@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17015 gss: avoid request replay
Sebastien Buisson [Fri, 13 Oct 2023 15:19:16 +0000 (17:19 +0200)]
LU-17015 gss: avoid request replay

Lustre's upcall cache has a retry mechanism in case the upcall was
interrupted or failed and we timed out waiting. In this case we do our
best to retry and do the upcall again.
But when the upcall cache is used for GSS contexts, the upcall cannot
be done twice with the same data. The GSSAPI implements security
measures that forbid that kind of request replay, to prevent
man-in-the-middle attacks for instance.

Add a new uc_acquire_replay field to struct upcall_cache, so that
upcall cache users can tell if acquire upcall can be replayed.
For identity upcall, this replay is fine. But for GSS contexts we need
to avoid those replays.
And bump upcall cache timeout value from 20s to 30s for GSS context
init requests.
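
The effect of such a flag can be sketched as follows. Only the field name uc_acquire_replay comes from the message above; the `acquire` function, the use of TimeoutError, and the retry limit are assumptions for illustration.

```python
def acquire(upcall, acquire_replay: bool, max_tries: int = 3):
    """Run upcall(); retry after a timeout only if replay is allowed."""
    tries = 0
    while True:
        tries += 1
        try:
            return upcall()
        except TimeoutError:
            # GSS-style users set acquire_replay=False, so the same
            # request is never sent twice.
            if not acquire_replay or tries >= max_tries:
                raise
```

An identity-style user would pass `acquire_replay=True` and get the existing retry behaviour, while a GSS-style user passes `False` and sees the timeout immediately.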

Also add more debug messages to gss code for both client and server
sides, and both kernel and userspace.

Lustre-change: https://review.whamcloud.com/52689
Lustre-commit: d0194a4b5f6efa26d5473c2793b525f5fdb77e67

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I56decc83a4f0d21be420e87cb0417826011932af
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53255
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17015 gss: support large kerberos token for rpc sec ctxt
Sebastien Buisson [Thu, 7 Sep 2023 07:33:36 +0000 (09:33 +0200)]
LU-17015 gss: support large kerberos token for rpc sec ctxt

If the current Kerberos setup is using large tokens, such as when the
PAC feature is enabled for Kerberos, authentication can fail because
the server side is unable to exchange the token between kernel and
userspace. This limitation is inherent to the sunrpc cache mechanism,
which can only handle tokens up to PAGE_SIZE.

For the RPC sec context phase, use Lustre's upcall cache mechanism
instead of the kernel's deprecated sunrpc cache. Note this phase does not
involve a proper upcall, only the downcall part is relevant to
populate the context computed in userspace.

Lustre-change: https://review.whamcloud.com/52305
Lustre-commit: 473a41fec6fb600c9b6e26010d88772f5252d1e1

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I94e945a99cab60d5b6a4c40076c40fffede217ab
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53254
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-17317 gss: do not continue using expired reverse context
Sebastien Buisson [Fri, 8 Dec 2023 08:05:04 +0000 (09:05 +0100)]
LU-17317 gss: do not continue using expired reverse context

In case a server uses an expired gss context to send a callback
request to a client, it might be that the associated context on
the client has already expired, and been purged from the cache.
This results in a GSS_S_NO_CONTEXT reply.
In this specific scenario, the server must mark its reverse context
as dead. This will lead to destruction of the expired context, and
creation of a new context suitable for further callback requests.

Lustre-change: https://review.whamcloud.com/53375
Lustre-commit: TBD (65f91673262098aa6d97448f68a036b0f2cdfd98)

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4af90cd70a3815851ec555ea85b49714c8da4202
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months agoRM-620 build: New tag 2.14.0-ddn125
Andreas Dilger [Wed, 20 Dec 2023 08:55:47 +0000 (01:55 -0700)]
RM-620 build: New tag 2.14.0-ddn125

New tag 2.14.0-ddn125

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic2e4b53c8540ffd039359565c06294645a62d328

17 months agoLU-13791 mdt: parameter to tune capabilities
Andreas Dilger [Tue, 19 Dec 2023 02:07:58 +0000 (19:07 -0700)]
LU-13791 mdt: parameter to tune capabilities

Add mdt.*.enable_cap_mask to allow specific capabilities to
be enabled and disabled individually.
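
Conceptually, a capability mask filters the capabilities a request may use. The sketch below assumes the parameter acts as a plain bitmask indexed by Linux capability number; the commit message does not spell out the parameter's format, and `effective_caps` is a hypothetical helper.

```python
# Linux capability numbers (from the kernel's capability definitions).
CAP_CHOWN = 0
CAP_DAC_OVERRIDE = 1
CAP_DAC_READ_SEARCH = 2
CAP_FOWNER = 3


def cap_bit(cap: int) -> int:
    """Bit corresponding to a capability number."""
    return 1 << cap


def effective_caps(requested: int, enable_cap_mask: int) -> int:
    """Keep only the requested capabilities that the mask enables."""
    return requested & enable_cap_mask


mask = cap_bit(CAP_CHOWN) | cap_bit(CAP_FOWNER)
requested = cap_bit(CAP_CHOWN) | cap_bit(CAP_DAC_OVERRIDE)
allowed = effective_caps(requested, mask)
```

Here CAP_DAC_OVERRIDE is dropped because the mask does not enable it, while CAP_CHOWN passes through.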

Fixes: f05edf8e2b ("LU-13791 sec: enable FS capabilities")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6fc0130a90693d673d8c2158e7e31c2de951553d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53500
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
17 months agoRM-620 build: New tag 2.14.0-ddn124
Andreas Dilger [Tue, 19 Dec 2023 06:10:54 +0000 (23:10 -0700)]
RM-620 build: New tag 2.14.0-ddn124

New tag 2.14.0-ddn124

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib274a425044bfbb22bc40bd51ccfda06ad6ba8b0

17 months agoLU-930 docs: fix whatis output
Timothy Day [Sun, 12 Mar 2023 15:19:54 +0000 (15:19 +0000)]
LU-930 docs: fix whatis output

The ".SH NAME" section has to be formatted in a certain
way for whatis and apropos to work correctly. Otherwise,
users will just see "(unknown subject)".

This patch fixes issues for all man pages.

Add a couple of one-line man page redirects.

Lustre-change: https://review.whamcloud.com/50264
Lustre-commit: 17bbf5bdd6f96f61dc0e39924dce540e91e1422c

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ie11eb921c84ff9ad19b50973c616f6fb6df1f461
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-12837 doc: add lfs-changelog* manpages
Etienne AUJAMES [Tue, 22 Nov 2022 12:39:25 +0000 (13:39 +0100)]
LU-12837 doc: add lfs-changelog* manpages

This patch moves the documentation for "lfs changelog" and "lfs
changelog_clear" utilities from "lfs.1" to the following manpages:
- lfs-changelog.1
- lfs-changelog_clear.1

Lustre-change: https://review.whamcloud.com/49209
Lustre-commit: 82e7ad348c77e5c164aa3e3155c9eb91872369d5

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Test-Parameters: trivial
Change-Id: I6db2e687e506a6116fe4755358a9abbd5509c3bb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53471
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoLU-14651 build: fix build for el7.9 kernels
Andrew Perepechko [Mon, 18 Dec 2023 18:19:26 +0000 (11:19 -0700)]
LU-14651 build: fix build for el7.9 kernels

Handle extra setattr_prepare() argument added in Linux 5.12 kernels
when building on older kernels.

Lustre-change: https://review.whamcloud.com/53503
Lustre-commit: TBD (from cc03199c61df217f7da249d9f9f3419e0333c671)

HPE-bug-id: LUS-12059
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: Ie7fd1c4d51b7a9b086cfca0db941321cbcce7057
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53494
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoRM-620 build: New tag 2.14.0-ddn123
Andreas Dilger [Fri, 15 Dec 2023 03:52:43 +0000 (20:52 -0700)]
RM-620 build: New tag 2.14.0-ddn123

New tag 2.14.0-ddn123

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I33af32d0a44376aee90286496939c4bcb114abd8

18 months agoLU-17366 kernel: update SLES15 SP5 [5.14.21-150500.55.39.1]
Jian Yu [Thu, 14 Dec 2023 19:38:42 +0000 (11:38 -0800)]
LU-17366 kernel: update SLES15 SP5 [5.14.21-150500.55.39.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.39.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/53467
Lustre-commit: TBD (from 7084f80ec256f6a7335fe4d5981db1e8bcbed440)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp5 testlist=sanity

Change-Id: Id9476e8726728b00d4079cdaf31b081f89190eb1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53468
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8779 build: kernel-abi-whitelists is not required
Jian Yu [Thu, 14 Dec 2023 17:07:48 +0000 (09:07 -0800)]
EX-8779 build: kernel-abi-whitelists is not required

This patch fixes a build dependency issue with
kernel-abi-whitelists, which is not required.

Test-Parameters: trivial

Change-Id: I3f8ad51a0ccab5c994d472d62934670b497c1454
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53448
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoRM-620 build: New tag 2.14.0-ddn122
Andreas Dilger [Thu, 14 Dec 2023 14:04:42 +0000 (07:04 -0700)]
RM-620 build: New tag 2.14.0-ddn122

New tag 2.14.0-ddn122

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I95bfa9740603cb10e34dfa347c9112d67c29764a

18 months agoLU-16456 tests: skip conf-sanity test_129 in interop
Andreas Dilger [Thu, 14 Dec 2023 03:14:00 +0000 (20:14 -0700)]
LU-16456 tests: skip conf-sanity test_129 in interop

test_129 was added in commit v2_14_56-40-gcefabee52
It should be skipped for older MDS versions.

Lustre-change: https://review.whamcloud.com/49601
Lustre-commit: 7e566c6a1f9d5324718ebc7149153f3272363b9c

Test-Parameters: trivial testlist=conf-sanity env=ONLY=129 serverversion=EXA6.2.0
Fixes: cefabee52 ("LU-15112 mgc: do not ignore target registration failure")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1e276c816ecf2f30dc970f9b5afe85d722540e5
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53452
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
18 months agoEX-7601 tests: unaligned read tests
Patrick Farrell [Tue, 12 Dec 2023 15:00:41 +0000 (10:00 -0500)]
EX-7601 tests: unaligned read tests

This adds testing for handling unaligned reads and partial
chunk reads from compressed files.

Testing for writes and multi-client and racing tests will
be added later, but we put the checking function in
test-framework now so it's easy to use later.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1001 1002",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I06217c8aeba75016aa4168f329026842dff1d979
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51841
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17265 tests: allow margin for sanity/39r
Arshad Hussain [Wed, 8 Nov 2023 06:38:07 +0000 (12:08 +0530)]
LU-17265 tests: allow margin for sanity/39r

The timestamp may be slightly outdated due to a gap between
writing a file and checking the timestamp, so take that into
consideration and allow a 2-second leniency when comparing
timestamps.

The on-disk inode may also not be flushed from the journal
immediately, so allow some time for it to be updated.

This patch also converts the hex value read via debugfs
to decimal.
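
The comparison described above can be sketched as below. The real test is a shell script; the function name and the sample values here are hypothetical.

```python
def timestamps_match(debugfs_hex: str, expected: int, margin: int = 2) -> bool:
    """Compare an on-disk timestamp (hex string) with an expected
    decimal value, allowing a margin in seconds."""
    on_disk = int(debugfs_hex, 16)  # accepts an optional "0x" prefix
    return abs(on_disk - expected) <= margin
```

With the default margin, a difference of up to 2 seconds in either direction is accepted, and anything larger counts as a mismatch.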

Lustre-change: https://review.whamcloud.com/53035
Lustre-commit: c5aa16db172afc9cbf0d4fd2c85261fef1a40d7b

Test-Parameters: trivial testlist=sanity env=ONLY=39r,ONLY_REPEAT=100
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9e765f9cd572fb25821f9a0401c34209b7c3f574
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53453
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
18 months agoLU-17360 kernel: update RHEL 9.3 [5.14.0-362.13.1.el9_3]
Jian Yu [Wed, 13 Dec 2023 08:32:22 +0000 (00:32 -0800)]
LU-17360 kernel: update RHEL 9.3 [5.14.0-362.13.1.el9_3]

Update RHEL 9.3 kernel to 5.14.0-362.13.1.el9_3 for Lustre client.

Lustre-change: https://review.whamcloud.com/53433
Lustre-commit: TBD (from 3662949bcd342a96f8dddcb6663872e870f9871b)

Test-Parameters: trivial env=SANITY_EXCEPT="906" \
  mdtcount=4 mdscount=2 clientdistro=el9.3 testlist=sanity
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3

Change-Id: I35863d298a612d7913d39f9031e792808f204ad4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53435
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-16397 test: check quota setting on QSD
Hongchao Zhang [Tue, 12 Dec 2023 10:37:22 +0000 (18:37 +0800)]
LU-16397 test: check quota setting on QSD

In some cases, the quota setting at the QMT may not be transferred to
the QSD in time, which can cause the test to fail.
This patch adds a check on the QSD after setting the quota limit via lfs.

Lustre-change: https://review.whamcloud.com/49533/
Lustre-commit: TBD (from 76a7ad75740639b9255c51277ff65ce261379af6)

Test-Parameters: trivial testlist=sanity-quota
Change-Id: Ia999317a36a0f97c1f66726cdc10e9edac3d8a53
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53402
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17057 tests: Fix sanity-sec/0
Arshad Hussain [Tue, 21 Nov 2023 15:10:51 +0000 (20:40 +0530)]
LU-17057 tests: Fix sanity-sec/0

A command executed through 'runas' that fails breaks
out of the running test script, even though the failure
is expected, because 'set -e' forces the script to
exit immediately. This patch fixes this by checking
the return value and then taking the appropriate
action.

This patch also fixes the 'touch' command for file f4 by
correctly calling it with the uid and gid that were set
a few lines above.

Lustre-change: https://review.whamcloud.com/53194
Lustre-commit: 0b5e252d973e00200660a81f1cdb440f8f4f1886

Test-Parameters: trivial testlist=sanity-sec env=ONLY=0,ONLY_REPEAT=100
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I06e6d22840e31add8c24cf90c31b98464d580ae7
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53439
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-17203 libcfs: ignore remaining items
Alex Zhuravlev [Tue, 17 Oct 2023 11:48:58 +0000 (14:48 +0300)]
LU-17203 libcfs: ignore remaining items

Remove the assertion checking the libcfs hashtable for emptiness
in cfs_hash_for_each_empty(). The only user of this hashtable
is the per-export ldlm lock set. In this case it is legal that
some locks can't be removed from the hashtable while they are in
the process of being enqueued. The hashtable is destroyed from the
export destroy function, which in turn is called only when all
RPCs on this export are done (exp_rpc_count==0).

Lustre-change: https://review.whamcloud.com/52726
Lustre-commit: f2f8b6deaf54f1a264b31b44f6cf875fa1629ab2

Fixes: 306a9b666e ("LU-16272 libcfs: cfs_hash_for_each_empty optimization")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I2b853b017bb7247a0c60cc8f464c2e08d649f0eb
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53404
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16272 libcfs: cfs_hash_for_each_empty optimization
Alexander Zarochentsev [Thu, 20 Oct 2022 19:23:39 +0000 (22:23 +0300)]
LU-16272 libcfs: cfs_hash_for_each_empty optimization

Restarts from bucket 0 in cfs_hash_for_each_empty()
cause excessive CPU consumption while checking the first
empty buckets.
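
The cost can be shown with a toy model. This is not the libcfs code:
the commit message states only the problem, so the sketch assumes the
optimization is to resume scanning from the bucket that was last
touched. The table layout and the `drain` helper are hypothetical.

```python
# Illustrative model only; this is not the libcfs implementation.
def drain(buckets, restart_from_zero):
    """Remove one item at a time until all buckets are empty.

    Returns the number of bucket inspections performed.
    """
    inspections = 0
    start = 0
    while True:
        found = False
        for i in range(start, len(buckets)):
            inspections += 1
            if buckets[i]:
                buckets[i].pop()
                found = True
                # After handling an item, either rescan from the top
                # or continue from the bucket that was just touched.
                start = 0 if restart_from_zero else i
                break
        if not found:
            return inspections


def make_table():
    # 1000 empty buckets followed by one bucket holding 100 items.
    return [[] for _ in range(1000)] + [list(range(100))]


print(drain(make_table(), restart_from_zero=True))   # 101101 inspections
print(drain(make_table(), restart_from_zero=False))  # 1101 inspections
```

In this static model no items are added while draining, so skipping
buckets already seen to be empty cannot miss anything.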

Lustre-change: https://review.whamcloud.com/48972
Lustre-commit: 306a9b666e5ea2882f704d93483355e7e147544f

HPE-bug-id: LUS-11311
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ic03875ea25101052468213043128912ac46daf32
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17015 sec: fix PTLRPC_CTX_STATUS_MASK
Sebastien Buisson [Wed, 11 Oct 2023 13:29:46 +0000 (15:29 +0200)]
LU-17015 sec: fix PTLRPC_CTX_STATUS_MASK

PTLRPC_CTX_STATUS_MASK should not include PTLRPC_CTX_NEW_BIT, which is
a bit index and not a value. Also, according to code in
sptlrpc_req_refresh_ctx():
if (unlikely(test_bit(PTLRPC_CTX_NEW_BIT, &ctx->cc_flags))) {
   if (ctx->cc_ops->refresh)
      ctx->cc_ops->refresh(ctx);
}
a context needs to be refreshed if it has the PTLRPC_CTX_NEW_BIT bit.
So the function that checks if a context is refreshed,
cli_ctx_is_refreshed(), should not return true if the
PTLRPC_CTX_NEW_BIT bit is set.

In the end, do not replace PTLRPC_CTX_NEW_BIT with anything else in
PTLRPC_CTX_STATUS_MASK. Having PTLRPC_CTX_NEW_BIT was a no-op (bitwise
OR with 0), but this was working as expected. Just clean up the code to
avoid headaches.
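
The difference between a bit index and a bit value can be sketched in
a few lines of Python. Only NEW_BIT = 0 follows from the text above
("bitwise OR with 0"); the other bit positions, the shortened names and
the `is_refreshed` helper are assumptions made for illustration.

```python
# Illustrative only. NEW_BIT = 0 follows from the commit message;
# the other bit positions are assumptions.
NEW_BIT = 0
UPTODATE_BIT = 1
DEAD_BIT = 2
ERROR_BIT = 3

NEW = 1 << NEW_BIT
UPTODATE = 1 << UPTODATE_BIT
DEAD = 1 << DEAD_BIT
ERROR = 1 << ERROR_BIT

old_mask = NEW_BIT | UPTODATE | DEAD | ERROR  # bit index mixed in with values
new_mask = UPTODATE | DEAD | ERROR            # index simply dropped


def is_refreshed(flags: int, mask: int) -> bool:
    """Hypothetical model: refreshed once any status bit in `mask` is set."""
    return (flags & mask) != 0


print(old_mask == new_mask)                # the old index was a no-op
print(is_refreshed(NEW, new_mask))         # a new context is not refreshed
print(is_refreshed(NEW, new_mask | NEW))   # using the value NEW would change that
```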

Lustre-change: https://review.whamcloud.com/52629
Lustre-commit: c744221a1fd55df33ca2b0e3e1b1ffd7ef3a986d

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibc2ca9dfaa176b098080f7f2867338b62953b50e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53441
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-14104 tests: sanity/123* shouldn't fail performance checks
Alex Zhuravlev [Mon, 2 Nov 2020 07:13:44 +0000 (10:13 +0300)]
LU-14104 tests: sanity/123* shouldn't fail performance checks

The sanity/123* performance checks should not fail when running in
VMs, as CPU resources usually aren't strictly guaranteed there.

Lustre-change: https://review.whamcloud.com/40512
Lustre-commit: b1915f13e3b69c72e3e4c1f2a32d022b6a20d347

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ieec4a89b921f7ccc198eb10513d4980ad3a20b51
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53456
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-8151 obd: Show correct shadow mountpoints for server
Arshad Hussain [Thu, 14 Dec 2023 08:04:25 +0000 (00:04 -0800)]
LU-8151 obd: Show correct shadow mountpoints for server

server_fill_super_common() preps the server for mounting
and forces the "read only" (SB_RDONLY) flag to restrict IO on
the server. As a result, the mount command always shows
server filesystems as "ro" even though they are "rw".

This patch double-checks the obd statfs (FS) state for the
"read only" flag (OS_STATFS_READONLY) and, if the filesystem
is not really read only, clears the SB_RDONLY flag.

The client output remains unchanged.

Output before patch:
/dev/.../mds1_flakey on /mnt/lustre-mds1 type lustre (ro,svname=...)
/dev/.../ost1_flakey on /mnt/lustre-ost1 type lustre (ro,svname=...)

Output after patch:
/dev/.../mds1_flakey on /mnt/lustre-mds1 type lustre (rw,svname=...)
/dev/.../ost1_flakey on /mnt/lustre-ost1 type lustre (rw,svname=...)

Test case conf-sanity/113 added.

Lustre-change: https://review.whamcloud.com/47131
Lustre-commit: 0171801df517988b0eb1023378c2c8c07a0a36f1

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ie92a686ae97dd62885f415b453bad6bdc0ed3d28
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53445
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
18 months agoLU-17347 debs: also move .ddeb files into debs/
Aurelien Degremont [Fri, 8 Dec 2023 12:34:09 +0000 (13:34 +0100)]
LU-17347 debs: also move .ddeb files into debs/

When building debian packages, the resulting packages are
moved into a 'debs/' subdir.

Don't miss the debug symbol packages 'dbgsym', which are
suffixed .ddeb.

Also add .buildinfo file.

Lustre-change: https://review.whamcloud.com/53378/
Lustre-commit: TBD

Test-Parameters: trivial
Change-Id: I52d0bddfaafc67c4a2a2dbc786d7f320c0b979f8
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17336 gss: fix __user pointer in rsi_upcall_seq_write
Sebastien Buisson [Wed, 6 Dec 2023 08:15:18 +0000 (09:15 +0100)]
LU-17336 gss: fix __user pointer in rsi_upcall_seq_write

rsi_upcall_seq_write() uses sscanf to get the string passed from
userspace, but this needs to be copied to a kernel buffer first.

Lustre-change: https://review.whamcloud.com/53342
Lustre-commit: TBD (from 523ffed1cb43eec5fac38c144967026308da9cad)

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2ec875b7c6c158695857fe912ec1dd9f41ddc25d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53434
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoRM-620 build: New tag 2.14.0-ddn121
Andreas Dilger [Tue, 12 Dec 2023 05:53:22 +0000 (22:53 -0700)]
RM-620 build: New tag 2.14.0-ddn121

New tag 2.14.0-ddn121

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9925c0d0e75c01a5e0a97bf34f8386efa2da8fbe

18 months agoRM-620 build: New tag lipe-2.38
Andreas Dilger [Tue, 12 Dec 2023 05:53:04 +0000 (22:53 -0700)]
RM-620 build: New tag lipe-2.38

New tag lipe-2.38

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I62e6034c466f566015d82da9b9a3a4e4c50cc4fb

18 months agoEX-8590 lipe: Use SSH poll to read stdout/err unblocking
Alexandre Ioffe [Sat, 25 Nov 2023 08:36:42 +0000 (00:36 -0800)]
EX-8590 lipe: Use SSH poll to read stdout/err unblocking

Limit hot-pools tests 75* to use only one client machine.
Fix the skip condition for tests 75a,b,c when bandwidth limit
options are not available.
Use ssh poll and non-blocking reads to read stdout/err in a loop
to prevent losing the output when it is not ready.

Test-Parameters: trivial testlist=hot-pools
Test-Parameters: testlist=hot-pools env=ONLY=75a,ONLY_REPEAT=82
Test-Parameters: testlist=hot-pools env=ONLY=75b,ONLY_REPEAT=82
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ibe07cdd51197c1f3c048b7fcdab6caff850067e7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17338 kernel: update RHEL 8.9 [4.18.0-513.9.1.el8_9]
Jian Yu [Thu, 7 Dec 2023 07:45:47 +0000 (23:45 -0800)]
LU-17338 kernel: update RHEL 8.9 [4.18.0-513.9.1.el8_9]

Update RHEL 8.9 kernel to 4.18.0-513.9.1.el8_9 for Lustre client.

Lustre-change: https://review.whamcloud.com/53357
Lustre-commit: TBD (from 5574088906d813c8a17237edc85e55c5d54f10f5)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=el8.9 serverdistro=el8.8 testlist=sanity

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.8 \
testgroup=full-part-3

Change-Id: Ied0d2873974a3c8cc6e346373457c8ebc09740d6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53360
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17307 mdt: get dirent count by request
Lai Siyao [Sat, 4 Nov 2023 13:32:59 +0000 (09:32 -0400)]
LU-17307 mdt: get dirent count by request

Add MA_DIRENT_CNT/LA_DIRENT_CNT to notify osd to get dirent count.
Set it in mdt_getattr_name_lock() and when auto-split is enabled so it
won't cause overhead when auto-split is disabled, and change
oo_dirent_count type to atomic_t so the result does not become
inaccurate over time from repeated addition/removal (which may
be used to know whether directory is empty or compare directories in
the future).

In osd_dirent_count() set oo_dirent_count to 0 before iteration to
avoid multiple threads iterating at the same time, which means the
result may not be accurate in this case, but it will be eventually.

Lustre-change: https://review.whamcloud.com/53229
Lustre-commit: TBD (from 50080036674faecfe8a94ebcbb0bdbdbeddac53d)

Fixes: 03a4431dac ("LU-11025 osd: osd_attr_get() returns dirent count")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I2be6c0dcfda1c98995a269585c5d8d781a8a3b42
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8236 pcc: abort data copy when clear PCC backend
Qian Yingjin [Fri, 8 Dec 2023 09:15:17 +0000 (04:15 -0500)]
EX-8236 pcc: abort data copy when clear PCC backend

This patch adds an "--abort" option to the "lctl pcc del|clear"
commands.
With this option, when removing a PCC backend from a client, the
ATTACH_ABORTING flag is first set on all files with an attach in
progress, and the command then waits for those attaches to abort.

Add sanity-pcc/test_108 to verify it.

Change-Id: I4e2f3ec8866e9af45f4524a9f45ee418ef4cb5be
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-8778 tests: clear "trace" in quota_fini
Sergey Cheremencev [Sun, 3 Dec 2023 04:11:29 +0000 (07:11 +0300)]
EX-8778 tests: clear "trace" in quota_fini

Clear trace debug level in quota_fini.

Fixes: ba4d37b9fc ("LU-13055 libcfs: allow comma-separated masks")
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I480b9975bbf99403cedbfd18154f365ebf181c09
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53385
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoLU-17212 gss: survive improper obd or imp at ctx init
Sebastien Buisson [Thu, 19 Oct 2023 09:11:48 +0000 (11:11 +0200)]
LU-17212 gss: survive improper obd or imp at ctx init

GSS context init requests can happen even after a client has been
unmounted, because they are coming from userspace (request-key,
lgss_keyring).
In this case they must be ignored, and the code must be robust enough
to survive an improper, already or partially shut down obd device or
import.

Lustre-change: https://review.whamcloud.com/52755
Lustre-commit: 3fcddf6dcdd92df6557c59913a61944f21d58615

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I541727165eadf1fcb7715e416da85d100976cf2f
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53291
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-17306 ofd: return error for reconnection
Alexander Boyko [Thu, 16 Nov 2023 22:57:24 +0000 (17:57 -0500)]
LU-17306 ofd: return error for reconnection

During the orphan cleanup phase, reconnection leads to an unsynchronized
last id between MDT and OST. This means that the MDT could assign
nonexistent objects to a client for a file create operation.

ofd_create_hdl()) capstor-OST0087: dropping old orphan cleanup request
MDS LAST_ID [0x2540000400:0xb6941:0x0] (747841) is 352 behind OST
    LAST_ID [0x2540000400:0xb6aa1:0x0] (748193), trust the OST

recovery-small test 144c reproduces the bug where the MDT loses
synchronization with the OST.
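
The numbers in the log excerpt above can be checked directly. The
following Python sketch parses the two bracketed triples and
reproduces the decimal values and the gap of 352; the `object_id`
helper is hypothetical and only reads the middle field.

```python
# Illustrative parser for the bracketed triples in the log excerpt above.
def object_id(fid_text: str) -> int:
    """Return the middle field of '[seq:oid:ver]' as an integer."""
    seq, oid, ver = fid_text.strip("[]").split(":")
    return int(oid, 16)


mds_last_id = object_id("[0x2540000400:0xb6941:0x0]")
ost_last_id = object_id("[0x2540000400:0xb6aa1:0x0]")
print(mds_last_id, ost_last_id, ost_last_id - mds_last_id)  # 747841 748193 352
```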

Lustre-change: https://review.whamcloud.com/53195
Lustre-commit: TBD (from 1f0deff150a3087a974adbac687a5019f6c0e39d)

Fixes: 63e17799a3 ("LU-8367 osp: enable replay for precreation request")
HPE-bug-id: LUS-11969
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I22c3d3b3db2acc9ad8f1b978b234afe7d3eef51d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53341
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 tgt: reorder tgt_brw_write decls
Patrick Farrell [Thu, 2 Nov 2023 21:23:01 +0000 (17:23 -0400)]
EX-7601 tgt: reorder tgt_brw_write decls

Reorder the declarations in tgt_brw_write.

This patch also serves as the series head for implementing
read-modify-write support for compressed chunks.

The process for read-modify-write is similar to that used
for unaligned reads.

At a high level, read-modify-write means we must read up,
decompress, then recompress and write back the data.  This
only applies when we're actually doing read-modify-write.

To know when to do this, we rely partly on the client.  If
the client is able to compress a chunk, either because it is
a complete chunk, or because the start is chunk aligned and
the write is past EOF, we know there is no read-modify-write
required.  Either there is no existing data (write past EOF)
or the data will be fully replaced.

So, when we see a write which is not fully chunk aligned and
not already compressed, we will do a read-modify-write.

For this, we round the IO lnbs and associated locking to
cover complete chunks, then we do a read of the unaligned
chunks.

i.e., if we have a write which goes from 63 KiB to 257 KiB
with a chunk size of 64 KiB, we will read 0-64 KiB and
256-320 KiB, and decompress those chunks into the buffer.
64 KiB to 256 KiB is *NOT* read, because those are complete
chunks.
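
The example above can be reproduced with a short Python sketch. It is
only a model of the arithmetic: `chunks_to_read` is a hypothetical
helper, and it ignores the EOF and already-compressed cases described
earlier.

```python
KIB = 1024


def chunks_to_read(start: int, end: int, chunk_size: int):
    """Return [start, end) byte ranges of partially overwritten chunks.

    `start` and `end` delimit the write as the half-open range [start, end).
    """
    ranges = []
    first = start // chunk_size * chunk_size
    if start != first:
        ranges.append((first, first + chunk_size))
    last = end // chunk_size * chunk_size
    if end != last and (last, last + chunk_size) not in ranges:
        ranges.append((last, last + chunk_size))
    return ranges


for lo, hi in chunks_to_read(63 * KIB, 257 * KIB, 64 * KIB):
    print(lo // KIB, hi // KIB)  # prints "0 64" then "256 320"
```

A fully chunk-aligned write, such as 64 KiB to 256 KiB, yields an
empty list, which matches the statement that complete chunks are not
read.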

We then set up a transfer mapping - identical to the process
for unaligned reads - so the client data is written into
the correct lnbs.

Now we have a set of chunk aligned lnbs which contain data
updated with the client write.  In the initial version, we
write these to disk uncompressed.  This is sufficient for
correct operation, but it does mean read-modify-write will
decompress those chunks.

There is code for recompression, but it is not working 100%
yet, and there are some complexities around managing holes
and EOF which still need to be resolved.

TBD if this will make our initial release - I am hopeful but
not sure yet.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia24583d4221f498928e99afa8c289b70e4d25f5b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
18 months agoEX-7601 ofd: improve decompress_rnb debug
Patrick Farrell [Mon, 11 Dec 2023 15:49:19 +0000 (10:49 -0500)]
EX-7601 ofd: improve decompress_rnb debug

Since we're very close to landing the unaligned read
patches, this minor debug improvement is being placed later
in the series.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I36ad243bd1f7025e358f9593f1008f0b851cc1bb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53411
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoEX-7601 ofd: add decompress_rnb implementation
Patrick Farrell [Tue, 31 Oct 2023 20:11:41 +0000 (16:11 -0400)]
EX-7601 ofd: add decompress_rnb implementation

This implements decompress_rnb, which is the core code for
handling unaligned reads from the client.

Decompress rnb takes an unaligned remote niobuf and
identifies the unaligned portion(s) of the IO, then finds
the corresponding local niobufs (pages read from disk),
and passes them on for decompression in place.

decompress_chunk_in_lnb decompresses the data in a set of
lnbs and copies it back to the same location, replacing the
raw data from disk with decompressed data.  (If the chunk
was not compressed, it does nothing.)

With this patch, the implementation of unaligned reads is
complete and we can add the compression sanity tests back
safely.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ifd1d9b03d5d004bec3f5e456da359b8d10e005f9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52916
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 ofd: lock pages from read to decompression
Patrick Farrell [Tue, 7 Nov 2023 21:35:51 +0000 (16:35 -0500)]
EX-7601 ofd: lock pages from read to decompression

When using the page cache on the server, for pages which
will be decompressed, we can't unlock them until they've
been decompressed.

Rather than only waiting to unlock the pages which will be
decompressed, we keep all of the read pages locked.  This
simplifies the code, at the cost of delaying other reads to
the aligned portion of an unaligned read, i.e., it shouldn't
be important in practice.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie98920327979a5c9600e8c9e8627816461ea1a34
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53026
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 ofd: add ofd_decompress_read implementation
Patrick Farrell [Fri, 27 Oct 2023 20:50:30 +0000 (16:50 -0400)]
EX-7601 ofd: add ofd_decompress_read implementation

ofd_decompress_read is responsible for walking the
remote niobufs (rnbs) in the RPC and identifying if they
are chunk unaligned.  It then passes them on to the rnb
decompression code (not implemented yet, see next patch).

It also allocates the bounce buffers for decompression so
they can be reused for each remote niobuf.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1f2f86ce3fc036ac5d79b060a5e44f6564e123aa
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 ofd: do not use page cache for compressed files
Patrick Farrell [Wed, 6 Dec 2023 15:55:24 +0000 (10:55 -0500)]
EX-7601 ofd: do not use page cache for compressed files

It is challenging for the server to safely use the page
cache with compressed files, because if data is
decompressed into the page cache, the data in cache now
differs from the data on disk.

This is a problem if *part* of the page cache is ever
evicted, because we can end up with a situation where a read
will be partially satisfied from cache and partially from
disk, but the data on disk is compressed and the data in
cache is not.

It is possible to deal with this by carefully ensuring the
page cache is not used just for decompressed data, but this
makes getting the buffers/lnbs for compressed files fairly
complicated.  Instead, we can just entirely block using the
server page cache for compressed files.

This must be done for both read and write, and only works
for ldiskfs - ZFS cannot easily be forced to not use its
page cache.  But that's OK because we do not support CSDC
with ZFS.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iee73abb29ad5631bb2203c2133756d7ebf5b686d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53348
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 ofd: add chunk_size to preprw_write
Patrick Farrell [Thu, 2 Nov 2023 21:37:43 +0000 (17:37 -0400)]
EX-7601 ofd: add chunk_size to preprw_write

preprw_write needs chunk size for rounding.  Add this in a
separate patch to keep things trivial; it will be used in
a subsequent patch.

This patch is really trivial on the write side, since the
read side already did most of this.  But it's being kept
separate for symmetry.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id25957dbc185b6e61b7f208cee8cf5f897f03944
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52962
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoEX-7601 ofd: add chunk rounding to read
Patrick Farrell [Fri, 27 Oct 2023 20:24:33 +0000 (16:24 -0400)]
EX-7601 ofd: add chunk rounding to read

We need to round all niobufs to chunk size in the read
process, so we read in the full chunk.

dt_bufs_get sets up the local niobuf for the read, so we
round before calling it.
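
The rounding itself is simple arithmetic. A minimal Python sketch,
assuming a niobuf can be modelled as an (offset, length) pair; the
`round_to_chunks` helper is hypothetical and is not the ofd code:

```python
def round_to_chunks(offset: int, length: int, chunk_size: int):
    """Expand (offset, length) outward to chunk boundaries."""
    start = offset - offset % chunk_size  # round down
    end = offset + length
    end += -end % chunk_size              # round up to the next boundary
    return start, end - start


# An 8 KiB read at offset 4 KiB becomes a read of the whole 64 KiB chunk.
print(round_to_chunks(4096, 8192, 65536))  # (0, 65536)
```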

This patch is a partial implementation of unaligned read
support, and breaks compression testing until the next few
patches are landed.  So this patch temporarily adds the
compression tests to ALWAYS_EXCEPT.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I587a519db4dae983db5db1d690e63e15bc010b7e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52867
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 tgt: add io_lnb_to_tx_lnb
Patrick Farrell [Mon, 30 Oct 2023 03:39:53 +0000 (23:39 -0400)]
EX-7601 tgt: add io_lnb_to_tx_lnb

With compression, the lnbs used for the disk IO on the
server can contain more data than the client requested,
due to reading up whole chunks for decompression.

This means we need to transfer only a subset of the lnbs.
We do this by creating a second set of lnbs, and pointing
them at the pages in the local io lnb which need to be
transferred to the client.

This code doesn't do anything for now, but it will kick in
with the next patch when we start rounding chunks for read.
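
The idea of a second set of lnbs pointing into the first can be
modelled in Python as a list that refers to a subset of the same page
objects. Everything here is an assumption for illustration: the 4 KiB
page size, the `tx_view` helper and the use of bytearrays as pages.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def tx_view(io_pages, io_start, req_offset, req_length):
    """Return the pages of `io_pages` that overlap the requested range.

    io_pages[i] holds bytes [io_start + i * PAGE_SIZE,
    io_start + (i + 1) * PAGE_SIZE). The returned list refers to the
    same page objects; nothing is copied.
    """
    first = (req_offset - io_start) // PAGE_SIZE
    last = (req_offset + req_length - 1 - io_start) // PAGE_SIZE
    return io_pages[first:last + 1]


# A 64 KiB chunk read from disk as 16 pages; the client asked for
# 8 KiB starting at offset 4 KiB.
io_pages = [bytearray(PAGE_SIZE) for _ in range(16)]
tx_pages = tx_view(io_pages, io_start=0, req_offset=4096, req_length=8192)
print(len(tx_pages), tx_pages[0] is io_pages[1])  # 2 True
```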

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0fe690718a3484578b139eaaec52c0c3b265da6a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52884
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoEX-7601 tgt: add second tbc lnb for RDMA
Patrick Farrell [Sun, 29 Oct 2023 02:16:01 +0000 (22:16 -0400)]
EX-7601 tgt: add second tbc lnb for RDMA

Compression requires the server to do local IO which differs
from the IO requested by the client.  This means we cannot
directly use the IO niobufs for doing the transfer to the
client.

So we add a second set of lnb pointers, which are used to
point at a specific subset of the pages in the main
per-thread cache.  This subset will be used for doing the
transfer to the client.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I53aa46045aaf335da20a311900ac0bf425823b22
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52881
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
18 months agoRM-620 build: New tag 2.14.0-ddn120
Andreas Dilger [Thu, 7 Dec 2023 11:13:42 +0000 (04:13 -0700)]
RM-620 build: New tag 2.14.0-ddn120

New tag 2.14.0-ddn120

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I59aeafd089ff479f3ff735a04a805ec99ecadfdb