Whamcloud - gitweb
Patrick Farrell [Wed, 20 Sep 2023 18:43:51 +0000 (14:43 -0400)]
EX-8270 ptlrpc: convert to void
Convert functions without meaningful return to void.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81f0baefd5b77b60ba699fa8749eaa83acadd8dd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52438
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 20 Sep 2023 18:38:58 +0000 (14:38 -0400)]
EX-8270 ptlrpc: stop passing around pool_index
We pass pool_index around from function to function over
and over, but it's easier to just pass the pool around.
This does require the pool to know its own index, but
that seems better anyway.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I42dc8b8094212c69b7a29cc3766bd0a10860f7af
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 20 Sep 2023 18:17:01 +0000 (14:17 -0400)]
EX-8270 ptlrpc: reduce usage of pool_index
The pool index is used over and over in a lot of places where
we should just use it once.
Note the printing functions are deliberately not combined
to maximum length lines for ease of reading.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7efbf491cf28f6fd16d06f5bbc42d714c908f34c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52436
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 20 Sep 2023 17:57:29 +0000 (13:57 -0400)]
EX-8270 ptlrpc: correct use of plural 'pools'
There are a bunch of spots which refer to a single pool by
pool index, but which say 'pools'. This is very confusing,
and in fact led to me misunderstanding the code at least
once.
Clean that up.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9eabcfe77a57a82c87b36e3b3e040be91671fbfb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52435
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 20 Sep 2023 17:52:16 +0000 (13:52 -0400)]
EX-8270 ptlrpc: simplify pools_should_grow
This patch is a prelude to replacing "pools_should_grow()"
with a "grow_pool" function. (The odd plural will be
removed shortly.)
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0accbce2c36fa97684fbee364057b8ff2f9ae12d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52434
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 20 Sep 2023 17:40:22 +0000 (13:40 -0400)]
EX-8270 ptlrpc: improve use of 'count'
This is a first trivial step towards fixing usage of
'count' in the page pools code. (And a whitespace fix.)
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic4f74db74b8cec63572d5fd5b129f861ab0cba7c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 20 Sep 2023 17:36:31 +0000 (13:36 -0400)]
EX-8270 ptlrpc: remove more uses of 'enc'
Remove a few more uses of 'enc' and note some we aren't
changing.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iaaf6c23ea295b22ded2e8942227ebd5ce4d34e13
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52432
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 16 Sep 2023 04:04:26 +0000 (00:04 -0400)]
EX-8270 ptlrpc: rename 'epp' to 'ppp'
Finish removing 'encryption' from page pool names except
for the module parameter, which is exposed in configuration
and so can't be changed.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1c14f6cf8cf1a19d89b5a7787aac1b67203866d3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52431
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 16 Sep 2023 03:58:38 +0000 (23:58 -0400)]
EX-8270 ptlrpc: start removing 'enc' from pool
Pools are no longer encryption page pools, so start renaming
them accordingly. (The 'epp' naming in the struct has been
left for the next patch.)
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iba3c98641e24173d95bf8bcf0df2424bbabf3ef9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52430
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 16 Sep 2023 03:48:15 +0000 (23:48 -0400)]
EX-8270 ptlrpc: improve usage of PAGES_POOL
PAGES_POOL isn't always used when it should be, so let's
improve that a bit (and start renaming a function).
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ifed59db63d15d61d15712e6df6b8dbae56f2f5b7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 16 Sep 2023 03:36:19 +0000 (23:36 -0400)]
EX-8270 ptlrpc: rename get_buf to get_pages
The sptlrpc_enc_pool_get_buf function actually gets a fixed
number of pages, which is sort of a buffer, but is better
understood as a set of pages.
Rename the function for getting pages for a ptlrpc desc so
we can give get_buf a more appropriate name.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9c03b9d638e7df7f09bf5724c5a6896b7d1e7b6c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 16 Sep 2023 03:24:51 +0000 (23:24 -0400)]
EX-8270 ptlrpc: rename 'size_bits' to 'order'
The kernel uses 'order' to refer to page allocations of a
certain 'order', meaning 2^order pages.
That's what our 'size bits' is - an allocation of a certain
'order'.
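As a quick illustration of the 'order' convention described above, the sketch below converts an order to a page count and a size in bytes. The 4096-byte page size and the helper names are assumptions made for this example; they are not identifiers from the patch.

```python
# Illustrative only: PAGE_SIZE and the helper names are assumptions,
# not identifiers from the Lustre or kernel sources.
PAGE_SIZE = 4096  # bytes; a common page size, assumed here


def pages_for_order(order: int) -> int:
    """Return the number of pages in an allocation of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return 1 << order  # 2^order pages


def bytes_for_order(order: int, page_size: int = PAGE_SIZE) -> int:
    """Return the size in bytes of an allocation of the given order."""
    return pages_for_order(order) * page_size


if __name__ == "__main__":
    for order in range(4):
        print(order, pages_for_order(order), bytes_for_order(order))
```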
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I38b184239814a0f692b644566075c798ed16f816
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52427
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 15 Sep 2023 17:36:20 +0000 (13:36 -0400)]
EX-8270 ptlrpc: rename 'pool' to 'pool_idx'
'pool' here is the index of the pool, not the pool itself.
Let's give it a name that makes clear it's a number and not
the actual pool.
Also remove an error condition which is asserted on
immediately before.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I636a4756a033d0b96a4772b8912f61c4b31b9c64
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52426
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 15 Sep 2023 17:30:04 +0000 (13:30 -0400)]
EX-8270 osc: minor compression cleanups
This cleans up some style and argument issues that made
the code a little harder to follow.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia3492ae79acf6c83d724cc91b0201c7872325853
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 20 Sep 2023 19:31:44 +0000 (12:31 -0700)]
LU-17132 kernel: update RHEL 8.8 [4.18.0-477.27.1.el8_8]
Update RHEL 8.8 kernel to 4.18.0-477.27.1.el8_8.
Lustre-change: https://review.whamcloud.com/52422
Lustre-commit: TBD (from 4b2d932cdf9813e3fffafdd24f2ba14f02e95822)
Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity
Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity
Change-Id: I4edd823b273c75618bc6dea236be8d64ed7c13ed
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52439
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 20 Sep 2023 19:52:56 +0000 (12:52 -0700)]
LU-17112 kernel: update RHEL 7.9 [3.10.0-1160.99.1.el7]
Update RHEL 7.9 kernel to 3.10.0-1160.99.1.el7.
Lustre-change: https://review.whamcloud.com/52359
Lustre-commit: TBD (from 8a59bb388266e3d2e1a5683ed1d9a1dc2fbf822a)
Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9
Change-Id: Iafb955b1927102fef4995b92d64218e36a4a8d51
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 20 Sep 2023 19:04:25 +0000 (12:04 -0700)]
LU-17109 kernel: new kernel [SLES15 SP5 5.14.21-150500.55.22.1]
This patch makes changes to support new SLES15 SP5 release
with kernel 5.14.21-150500.55.22.1 for Lustre client.
Lustre-change: https://review.whamcloud.com/52340
Lustre-commit: TBD (from c410e3c89eadd728559782f94102f283ef52d63a)
Test-Parameters: trivial clientdistro=sles15sp5 testlist=sanity
Test-Parameters: trivial clientdistro=sles15sp4 testlist=sanity
Change-Id: I278017a5c996a8cf4e3d604aa928e968ca007312
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Thu, 14 Sep 2023 01:07:14 +0000 (18:07 -0700)]
EX-8232 test: use client machines additionally to OSS
In addition to OSS nodes, add replication agents to client nodes.
This makes it possible to test lamigo replication on a large
number of nodes.
Test-Parameters: testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I980f95a4885991faf7d958e98fdbc7811fb1f163
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 19 Sep 2023 18:07:26 +0000 (12:07 -0600)]
LU-15671 ost: remove REP_MBITS from OST_CONNECT2_SUPPORTED
Remove the OBD_CONNECT2_REP_MBITS flag from the OST_CONNECT2_SUPPORTED
mask on the OST that was accidentally included in a backported patch.
If newer clients that have support for REP_MBITS (e.g. 2.15.x) try to
recover with the 2.14.0-ddn91+ OSS, they will loop endlessly since
they are not exchanging the right information in the replay RPC/reply.
Test-Parameters: trivial
Fixes: b85a12aa73 ("LU-15671 mds: do not send OST_CREATE transno interop")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6bef75900f8efdb8a1e35545a86c580a68f9ddc8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52417
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lei Feng [Thu, 4 Nov 2021 11:41:06 +0000 (19:41 +0800)]
LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12
In some rare cases 12 quota ids are needed.
Usually (user, group) * (block, inode) * (inode, parent) = 8 qids
are needed. But with project id,
(user, group, project) * (block, inode) * (inode, parent) = 12 qids
are needed.
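The counts above can be checked by enumerating the combinations. This is only a sketch of the arithmetic; the tuple and function names are made up for the example and do not come from the quota code.

```python
from itertools import product

# Hypothetical names, used only to reproduce the arithmetic in the text.
ID_TYPES_BASE = ("user", "group")
ID_TYPES_WITH_PROJECT = ("user", "group", "project")
RESOURCES = ("block", "inode")
OBJECTS = ("inode", "parent")


def count_qids(id_types) -> int:
    """Count (id type, resource, object) combinations for a transaction."""
    return len(list(product(id_types, RESOURCES, OBJECTS)))


if __name__ == "__main__":
    print(count_qids(ID_TYPES_BASE))          # without project IDs
    print(count_qids(ID_TYPES_WITH_PROJECT))  # with project IDs
```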
Lustre-change: https://review.whamcloud.com/45456
Lustre-commit: I4b3ee197f6e274abda06edf60b246f089fe28d10
Signed-off-by: Lei Feng <flei@whamcloud.com>
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Test-Parameters: trivial testlist=sanity-quota
Change-Id: I26bcf97cbb79caee6f76dd076e1a03cd9ce3d9c5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 5 Sep 2023 09:08:16 +0000 (11:08 +0200)]
LU-17015 obdclass: set cache entry/acquire expiry at init
Give the ability to define values for cache entry expire and acquire
expire directly at upcall cache init.
Lustre-change: https://review.whamcloud.com/52271
Lustre-commit: TBD (from 2d24c820f32699d66b56024ae99a7b27944f6130)
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iee0dea66943ab6747d85a378861ae98c29faa11a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52370
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Mon, 28 Aug 2023 09:37:51 +0000 (11:37 +0200)]
LU-17015 obdclass: make upcall cache hashtable size dynamic
The hash table used by the upcall cache mechanism should have an
adjustable size, depending on the purpose and context where it is
used.
Lustre-change: https://review.whamcloud.com/52128
Lustre-commit: 79f823bd40ee97a5846d828efce1080dc04a6057
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I53c5cb14f9a5630fc269d97cead9a5ca6a33895e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Fri, 18 Aug 2023 21:51:44 +0000 (14:51 -0700)]
LU-17041 kernel: update RHEL 8.8 [4.18.0-477.21.1.el8_8]
Update RHEL 8.8 kernel to 4.18.0-477.21.1.el8_8.
Lustre-change: https://review.whamcloud.com/52003
Lustre-commit: TBD (from 4268396e4ee6e33a91b11ba4d0f77838aa3c172a)
Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity
Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity
Change-Id: Ie24c8e438dd33afafb900664d9a4010160bc1a45
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52008
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 13 Sep 2023 20:01:51 +0000 (13:01 -0700)]
LU-17111 kernel: update RHEL 9.2 [5.14.0-284.30.1.el9_2]
Update RHEL 9.2 kernel to 5.14.0-284.30.1.el9_2 for Lustre client.
Lustre-change: https://review.whamcloud.com/52358
Lustre-commit: TBD (from f6f135c77911707b4c7282fedb3973a6a16e0d7d)
Test-Parameters: trivial clientdistro=el9.2 testlist=sanity
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Id80dbba6b4434a83cf925d6961d727941274edf4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52365
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Wei Liu [Mon, 14 Aug 2023 19:02:24 +0000 (12:02 -0700)]
LU-16424 tests: Add version check in sanity-lnet
Skip sanity-lnet test_205, test_207, test_209 and test_254 if
version is older than 2.14.58 since the lnet_if_list
function was added in Fixes: 3166a201e0 ("LU-15398 tests: Use remote peers for health tests")
Lustre-change: https://review.whamcloud.com/c/fs/lustre-release/+/51942
Lustre-commit: ee4f470d590dd19d9c7d188958d9305ccd666e5e
Test-Parameters: trivial testlist=sanity-lnet \
serverjob=lustre-b_es5_2 serverbuildno=591 \
serverdistro=el7.9
Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: I9cd62d91980784e3b33cf4e30426bf74d17f717f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51942
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52379
Patrick Farrell [Thu, 15 Jun 2023 18:49:56 +0000 (14:49 -0400)]
EX-7681 scripts: Compression estimate script
ll_compression_scan is a simple tool which can be run on any
Linux system to estimate the space usage reduction from the
Lustre Client Side Data Compression (CSDC) feature with
particular compression settings (algorithm, chunk size,
and compression level).
When run on one or more directories, it will recursively
examine a percentage of files under that directory, sampling
data in those files to estimate how the files will compress.
This tool samples data throughout the file, so it should
avoid problems with poor estimates for files with headers
which differ from the bulk data in the file.
However, if the directory tree is particularly imbalanced,
with a few large uncompressible files in one directory, and
many small files in other directories, then scanning a small
percentage of files may give a misleading compression estimate.
Sampling a larger percentage of files will improve this.
This tool requires the lz4, lzop, and gzip utilities to
be installed in order to test those compression types.
(lzop is the command line utility for lzo compression)
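The sampling idea described above can be sketched roughly as follows. This is not the ll_compression_scan implementation: it works on an in-memory buffer and uses Python's zlib as a stand-in for the lz4, lzop and gzip utilities, and the function name, default chunk size and sample count are assumptions chosen for the example.

```python
import os
import zlib


def estimate_ratio(data: bytes, chunk_size: int = 65536,
                   samples: int = 8, level: int = 6) -> float:
    """Estimate compressed/original size by sampling chunks spread over data.

    Hypothetical helper: zlib stands in for the real compression tools.
    """
    if chunk_size < 1 or samples < 1:
        raise ValueError("chunk_size and samples must be positive")
    if not data:
        return 1.0
    n_chunks = (len(data) + chunk_size - 1) // chunk_size
    samples = min(samples, n_chunks)
    # Pick chunk indices spread evenly from the first chunk to the last,
    # so a header at the start of the file does not dominate the estimate.
    if samples == 1:
        indices = [0]
    else:
        indices = [i * (n_chunks - 1) // (samples - 1) for i in range(samples)]
    original = 0
    compressed = 0
    for idx in indices:
        chunk = data[idx * chunk_size:(idx + 1) * chunk_size]
        original += len(chunk)
        compressed += len(zlib.compress(chunk, level))
    return compressed / original


if __name__ == "__main__":
    # A highly compressible header followed by incompressible bulk data.
    blob = b"\0" * 65536 + os.urandom(1 << 20)
    print(f"estimated ratio: {estimate_ratio(blob):.2f}")
```

Sampling only the first chunk of such a file would suggest it compresses almost completely, while sampling across the whole file gives an estimate much closer to the bulk data.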
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I092f9608553eba10bacfcc3c4a3fafc9a454c287
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51333
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 13 Sep 2023 15:36:58 +0000 (11:36 -0400)]
EX-7433 osc: disable CPU-access features for RDMA only pages
Pages which cannot be accessed by the CPU are referred to
as RDMA only pages. If pages cannot be accessed by the
CPU, it is impossible for us to do compression,
encryption, checksums, or short-io (data-in-RPC) on them.
This patch disables compression and encryption for these
pages and cleans up the code so checksums and short-io
are disabled by the same code.
The only user of RDMA only pages today is Nvidia's GPU
direct, so this patch disables compression and
encryption with GPU direct.
NB: We eventually intend to handle compression for
GPU direct with server side compress/decompress.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iad9311617cddf27d3ff75a17429499c573067ea0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51770
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Mon, 11 Sep 2023 12:06:02 +0000 (20:06 +0800)]
LU-16750 ldiskfs: optimize metadata allocation for hybrid LUNs
With LVM it is possible to create an LV with SSD storage at the
beginning of the LV and HDD storage at the end of the LV, and use that
to separate ext4 metadata allocations (that need small random IOs)
from data allocations (that are better suited for large sequential
IOs) depending on the type of underlying storage. Between 0.5-1.0% of
the filesystem capacity would need to be high-IOPS storage in order to
hold all of the internal metadata.
This would improve performance for inode and other metadata access,
such as ls, find, e2fsck, and in general improve file access latency,
modification, truncate, unlink, transaction commit, etc.
This patch splits the largest free order group lists and the average
fragment size lists into two additional lists for IOPS/fast storage
groups, and performs cr 0 / cr 1 group scanning for metadata block
allocation in the following order:
    if (allocate metadata blocks)
        if (cr == 0)
            try to find group in largest free order IOPS group list
        if (cr == 1)
            try to find group in fragment size IOPS group list
        if (above two find failed)
            fall through normal group lists as before
    if (allocate data blocks)
        try to find group in normal group lists as before
        if (failed to find group in normal group && mb_enable_iops_data)
            try to find group in IOPS groups
Non-metadata block allocation does not allocate from the IOPS groups
if non-IOPS groups are not used up.
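The search order above can be modelled with a small sketch. It only mirrors the ordering described in this message; the function names, the list labels and the dictionary of free groups are hypothetical and do not correspond to the ext4 mballoc data structures.

```python
# Hypothetical labels for the group lists named in the commit message.
NORMAL = "normal group lists"
IOPS_FREE_ORDER = "largest free order IOPS group list"
IOPS_FRAGMENT = "fragment size IOPS group list"
IOPS_ANY = "IOPS groups"


def candidate_lists(is_metadata: bool, cr: int,
                    mb_enable_iops_data: bool = False) -> list:
    """Return the group lists to search, in the order they are tried."""
    if is_metadata:
        order = []
        if cr == 0:
            order.append(IOPS_FREE_ORDER)
        elif cr == 1:
            order.append(IOPS_FRAGMENT)
        order.append(NORMAL)  # fall through to the normal lists
        return order
    order = [NORMAL]
    if mb_enable_iops_data:
        order.append(IOPS_ANY)  # data may use IOPS groups only as a last resort
    return order


def find_group(free_groups: dict, is_metadata: bool, cr: int,
               mb_enable_iops_data: bool = False):
    """Return (list name, group) for the first list with a free group."""
    for name in candidate_lists(is_metadata, cr, mb_enable_iops_data):
        groups = free_groups.get(name, [])
        if groups:
            return name, groups[0]
    return None


if __name__ == "__main__":
    print(candidate_lists(is_metadata=True, cr=0))
    print(find_group({IOPS_ANY: [1]}, is_metadata=False, cr=0,
                     mb_enable_iops_data=True))
```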
Add for mke2fs an option to mark which blocks are in the IOPS region
of storage at format time:
-E iops=0-1024G,4096-8192G
so the ext4 mballoc code can then use the EXT4_BG_IOPS flag in the
group descriptors to decide which groups to allocate dynamic
filesystem metadata.
--
v2->v3: add sysfs mb_enable_iops_data to enable data block allocation
from IOPS groups.
v1->v2: for metadata block allocation, search in IOPS list then normal
list.
Lustre-change: https://review.whamcloud.com/51625
Lustre-commit: TBD (from 452f102a581f2a8ef8396bf0ba5584d61512a267)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ice2d25b8db19f67e70690f9ccebc419f253b12bd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Fri, 21 Jul 2023 07:34:20 +0000 (15:34 +0800)]
LU-14438 ldiskfs: backport ldiskfs mballoc patches
This contains the following kernel patches:
a078dff87013 ("ext4: fixup possible uninitialized variable access in
ext4_mb_choose_next_group_cr1()")
80fa46d6b9e7 ("ext4: limit the number of retries after discarding
preallocations blocks")
820897258ad3 ("ext4: Refactor code related to freeing PAs")
cf5e2ca6c990 ("ext4: mballoc: refactor
ext4_mb_discard_preallocations()")
83e80a6e3543 ("ext4: use buckets for cr 1 block scan instead of
rbtree")
a9f2a2931d0e ("ext4: use locality group preallocation for small
closed files")
1940265ede66 ("ext4: avoid unnecessary spreading of allocations among
groups")
4fca50d440cc ("ext4: make mballoc try target group first even with
mb_optimize_scan")
3fa5d23e68a3 ("ext4: reflect mb_optimize_scan value in options file")
077d0c2c78df ("ext4: make mb_optimize_scan performance mount option
work with extents")
196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
21175ca434c5 ("ext4: make prefetch_block_bitmaps default")
3d392b2676bf ("ext4: add prefetch_block_bitmaps mount option")
cfd732377221 ("ext4: add prefetching for block allocation bitmaps")
4b68f6df1059 ("ext4: add MB_NUM_ORDERS macro")
dddcd2f9ebde ("ext4: optimize the implementation of
ext4_mb_good_group()")
a6c75eaf1103 ("ext4: add mballoc stats proc file")
67d251860461 ("ext4: drop s_mb_bal_lock and convert protected fields
to atomic")
Lustre-change: https://review.whamcloud.com/51472
Lustre-commit: TBD (from 8da59fc988f0cebcac10e8ef1faab1e4c913de03)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I079dfb74bd743894934484803cedb683073e4d94
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52120
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 14 Sep 2023 07:41:01 +0000 (01:41 -0600)]
RM-620 build: New tag 2.14.0-ddn102
New tag 2.14.0-ddn102
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I47ca9dbbb4c952facefa4de57331aab791d72ed2
Andreas Dilger [Thu, 14 Sep 2023 07:40:38 +0000 (01:40 -0600)]
RM-620 build: New tag lipe-2.31
New tag lipe-2.31
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8fe1d357b47950452f751d7e833f7e5a84c0a867
Alexandre Ioffe [Wed, 6 Sep 2023 08:17:59 +0000 (01:17 -0700)]
EX-7290 lipe: lipe_find3 get attr warnings
Report each get attr error when the command line
option --warnings=get-attr is given.
Count all get attr errors per attr type and report them at the end.
Exclude the 'trusted.link' when scanning an OST.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3,sanityn
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I5e9b226bd4046eddcf779ca06af0892589d447ac
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52292
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Fri, 19 Nov 2021 19:52:28 +0000 (22:52 +0300)]
LU-10729 tests: replay-dual/23d to wait
replay-dual/23d simulates a dropped reply for the executed
update, but previous tests can break this:
- the update modifies remote llog
- there can be another update to that remote log
(from the previous tests)
- fail_loc (OBD_FAIL_UPDATE_OBJ_NET) is applied to the
old update
- the 23d's update gets stuck
so the test has to ensure there are no pending/in-flight
updates.
Lustre-change: https://review.whamcloud.com/45623
Lustre-commit: 63a19f6f666b9d18fede66ce8bcd2d799b5e0fa7
Test-Parameters: trivial testlist=replay-dual mdscount=2 mdtcount=4
Test-Parameters: testlist=replay-dual mdscount=2 mdtcount=4
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3b60468d1f6f467006d5872ec62b81f57fa0423e
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Mon, 11 Sep 2023 15:31:09 +0000 (17:31 +0200)]
LU-17108 nodemap: make map_mode available for default nm
The map_mode property controls the way mapping is carried out. It
is already available on regular nodemaps, to decide whether uids, gids
and/or projids will be mapped.
On the default nodemap, where it is not possible to define mappings,
the map_mode property will be taken into account when trusted is 0 and
deny_unknown is 0. Unmapped IDs will be left unchanged.
Lustre-change: https://review.whamcloud.com/52336
Lustre-commit: TBD (from 613ca001049887b1dc0cb2501f566c263ff7a006)
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I16a2f5cfda11a8435b56a00f3e97bdc70741c156
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Wed, 13 Sep 2023 05:12:35 +0000 (22:12 -0700)]
LU-16534 build: Prefer timer_delete[_sync]
Linux commit v6.1-rc1-7-g9a5a30568697
timers: Get rid of del_singleshot_timer_sync()
Linux commit v6.1-rc1-11-g9b13df3fb64e
timers: Rename del_timer_sync() to timer_delete_sync()
Linux commit v6.1-rc1-12-gbb663f0f3c39
timers: Rename del_timer() to timer_delete()
Prefer timer_delete_sync() to del_singleshot_timer_sync()
Prefer timer_delete_sync() to del_timer_sync()
Prefer timer_delete() to del_timer()
Provide del_timer and del_timer_sync when
timer_delete[_sync] is not available
Lustre-change: https://review.whamcloud.com/49922
Lustre-commit: 0ec89529ce14a1bb5af0c01ed86424a10e0e373c
Test-Parameters: trivial
HPE-bug-id: LUS-11470
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4c946c315a83482dd0bd69e5e89f0302a67bf81c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52357
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Chris Horn [Fri, 27 Jan 2023 20:47:25 +0000 (14:47 -0600)]
LU-16483 ptlrpc: Track highest reply XID
Keep track of the highest XID that we've received a reply for.
When an OBD_PING expires, do not disconnect the import if the failed
XID is less than or equal to the last reply XID. This avoids the
situation where a lost OBD_PING RPC causes a reconnect even though we've
completed other RPCs in the meantime.
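The rule described above can be sketched as a small state tracker. This is a simplified model, not the ptlrpc code: the class, attribute and method names are invented for the example.

```python
class ImportState:
    """Hypothetical model of an import that remembers the highest reply XID."""

    def __init__(self) -> None:
        self.last_reply_xid = 0

    def reply_received(self, xid: int) -> None:
        """Record a reply, keeping only the highest XID seen so far."""
        self.last_reply_xid = max(self.last_reply_xid, xid)

    def should_disconnect_on_ping_timeout(self, ping_xid: int) -> bool:
        """Disconnect only if no reply at or beyond the ping's XID has arrived."""
        return ping_xid > self.last_reply_xid


if __name__ == "__main__":
    imp = ImportState()
    imp.reply_received(10)
    # A ping with XID 9 timed out, but a later RPC (XID 10) already completed.
    print(imp.should_disconnect_on_ping_timeout(9))
    # A ping with XID 11 timed out and nothing newer has been answered.
    print(imp.should_disconnect_on_ping_timeout(11))
```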
Lustre-change: https://review.whamcloud.com/49807
Lustre-commit: eb1f4a5222039be9f728839ec8f9cde904a1273f
LU-16483 tests: replay-single test_200 fixes
Modify test to ensure idle disconnect is enabled for all targets
except OST0000. This prevents an issue where an idle ping is sent to
another target instead of OST0000.
Re-work test to check the debug log for all relevant messages.
rcli is not set correctly when RCLIENTS contains multiple hostnames.
Fix it by not surrounding RCLIENTS with double quotes.
Added a debug statement to ptl_send_rpc(), and moved an existing one,
to facilitate debugging any future test failures.
Lustre-change: https://review.whamcloud.com/50891
Lustre-commit: fdfdf5c05cf64294068a5cbfe818b64bd9e577f9
HPE-bug-id: LUS-11474
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I7e66bcc1368fa41ec86ffd843abac676f8d29254
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 13 Sep 2023 15:25:30 +0000 (11:25 -0400)]
EX-7818 osc: don't check for start inside the chunk
Chunk size is the same for the whole request and every
chunk offset is a multiple of the chunk size.
There is no need to search for a compression header in every page.
It is enough to check every page whose offset is a multiple of the
chunk size.
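A minimal sketch of the check described above: only pages whose file offset is a multiple of the chunk size can start a chunk, so only those need to be examined for a compression header. The 4096-byte page size and the function name are assumptions for the example, not taken from the osc code.

```python
PAGE_SIZE = 4096  # bytes; assumed page size for this sketch


def pages_to_check(start_offset: int, page_count: int, chunk_size: int,
                   page_size: int = PAGE_SIZE) -> list:
    """Return indices of pages whose file offset is a multiple of chunk_size."""
    return [i for i in range(page_count)
            if (start_offset + i * page_size) % chunk_size == 0]


if __name__ == "__main__":
    # 32 pages starting at offset 0 with 64 KiB chunks.
    print(pages_to_check(0, 32, 65536))
```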
Test-Parameters: testlist=sanity-compr env=ONLY="1-6"
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie2ef645130656279e152ea1f7e6db01cb33836ca
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51650
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Hongchao Zhang [Thu, 13 Jul 2023 02:56:58 +0000 (10:56 +0800)]
EX-8190 quota: fix race in qmt_seed_glbe
There is a deadlock in qmt_pool_recalc:
The rw_semaphore "qmt_pool_info.qpi_sarr.osts.op_rw_sem" has been
acquired in qmt_pool_recalc (read mode), but it is acquired once
more in qmt_seed_glbe_all (read mode), and it will be stuck if
there is pending write mode lock acquisition.
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ib7db17700a90feaa9bfe8300bab509567ac1ed21
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52346
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Fri, 7 Jul 2023 09:21:05 +0000 (05:21 -0400)]
LU-11457 osd-ldiskfs: scrub FID reuse
It's possible for two inodes to point back to the same FID. Check the
inodes in osd_scrub_check_update() to decide which mapping
should be kept:
* if one inode doesn't exist, its mapping is stale.
* if one inode's mtime is later than the other's, keep its mapping.
* if the two inode mtimes are equal and one inode's size is not 0, keep
its mapping; otherwise, if both inode sizes are 0, just keep the existing
mapping.
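The rules above can be written as a small decision function. This is a sketch of the listed rules only, not the osd_scrub_check_update() logic: the type and function names are hypothetical, and two cases the message does not spell out (neither inode exists, or both sizes are non-zero with equal mtimes) are assumed here to keep the existing mapping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InodeInfo:
    """Hypothetical summary of the inode attributes the rules look at."""
    mtime: int
    size: int


def keep_new_mapping(existing: Optional[InodeInfo],
                     new: Optional[InodeInfo]) -> bool:
    """Return True to keep the new inode's mapping, False to keep the existing one.

    None means the inode does not exist.
    """
    if new is None:
        return False  # new mapping is stale (assumption: also if both are missing)
    if existing is None:
        return True   # existing mapping is stale
    if new.mtime != existing.mtime:
        return new.mtime > existing.mtime  # the later mtime wins
    # Equal mtimes: prefer the inode with data; when both sizes are 0, or
    # (assumption) both are non-zero, keep the existing mapping.
    return new.size != 0 and existing.size == 0


if __name__ == "__main__":
    print(keep_new_mapping(InodeInfo(mtime=100, size=0),
                           InodeInfo(mtime=100, size=4096)))
```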
Remove IDIF support in osd_scrub_check_update() to simplify
code logic.
Add sanity-scrub 4e to verify it.
Lustre-change: https://review.whamcloud.com/51601
Lustre-commit: dc53daaaf9e158355edb8f6021123fed9a1429ef (TBD)
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ida020c2852c66f1a8910845bd16ab4c882858a4e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52037
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Minh Diep [Wed, 6 Sep 2023 03:00:21 +0000 (20:00 -0700)]
EX-8187 build: add kernel to MOFED rpms
We need to build MOFED so that it is tied to the kernel version;
otherwise it won't be installed on kernel update.
Test-Parameters: trivial
Change-Id: I167052aab438493a1301515459488c8085087293
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52286
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Hongchao Zhang [Mon, 14 Aug 2023 07:28:17 +0000 (15:28 +0800)]
LU-16988 mdd: update projid when merging layout
When creating mirrors by the special directory ".lustre/fid",
the project ID could not be set correctly, which causes
wrong quota calculation for the projid.
Lustre-change: https://review.whamcloud.com/51859
Lustre-commit:
bb2525b0ddf9190ae340552fa615833b735b61d3
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ia4c3a8973b8c467642e12629d36fa42d64162084
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52303
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Thomas Bertschinger [Fri, 5 May 2023 21:05:22 +0000 (17:05 -0400)]
LU-13031 jobstats: store jobid in xattr when files are created
This change stores the jobid of the process that creates a file in an
extended attribute in the file's MDT inode, at file creation time.
The name of the xattr is determined by a new sysfs parameter
"mdt.*.job_xattr" so that the admin can choose a name that does
not conflict with other uses they may have for a given xattr.
The default value is "user.job". A value of "NONE" means that
the jobid will not be stored in the inode.
If the name is in the user namespace "user.", then the name portion
can be up to 7 alphanumeric characters long. The admin can choose
the trusted namespace to prevent users from modifying the value,
but only "trusted.job" is allowed in this namespace.
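The naming rules for "mdt.*.job_xattr" described above can be sketched as a small validator. This is a Python illustration under stated assumptions, not the kernel-side check: the function name is hypothetical, and requiring at least one ASCII alphanumeric character after "user." is an assumption the message does not spell out.

```python
def is_valid_job_xattr(name: str) -> bool:
    """Check a job_xattr value against the rules in the commit message."""
    # "NONE" disables storing the jobid; "trusted.job" is the only
    # name allowed in the trusted namespace.
    if name in ("NONE", "trusted.job"):
        return True
    prefix = "user."
    if not name.startswith(prefix):
        return False
    suffix = name[len(prefix):]
    # The name portion may be up to 7 alphanumeric characters long.
    return 1 <= len(suffix) <= 7 and suffix.isascii() and suffix.isalnum()
```

With these rules the default "user.job" is accepted, while "trusted.other" and "user.toolongname" are rejected.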
Allowing users to modify the contents of the xattr is helpful so
that the jobid can be preserved even when files are moved with tools
like `cp` or `rsync`, and when copied from one filesystem to another.
Lustre-change: https://review.whamcloud.com/50982
Lustre-commit:
23a2db28dcf1422a6a6da575e907fd257106d402
LU-13031 tests: skip sanity/test_205h,205i in interop
Skip sanity tests 205h and 205i when the MDS version is too old
to have the jobid xattr changes. Fix test 103a to not try to set
the job_xattr parameter when it does not exist.
Lustre-change: https://review.whamcloud.com/52095
Lustre-commit:
a6df532162556028c2ab9b974989fc0cca68d4fe
Test-Parameters: testlist=sanity clientdistro=el8.8 clientjob=lustre-b_es-reviews clientbuildno=12634 env=ONLY=103
Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Iad78a5ec6fbc4b761ff481141763bdd0cdcd0128
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 6 Sep 2023 21:25:19 +0000 (15:25 -0600)]
LU-13798 llite: fix LL_SBI_FLAGS array again
Errors are still being printed by ll_sbi_flags_seq_show():
exa6: Revise array LL_SBI_FLAGS to match sbi flags please.
This is because LL_SBI_PARALLEL_DIO is a negative int so downshift
does not clear the high bits. Make ll_flags unsigned to avoid this.
Move the LL_SBI_SNAPSHOT flag out of the way of other flags.
This is in-memory only, so it doesn't matter what value is used.
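The signed-shift problem can be reproduced outside the kernel. The sketch below emulates 32-bit C integers in Python and assumes, for illustration only, that the offending flag occupies bit 31. In C, right-shifting a negative signed int is implementation-defined and is commonly an arithmetic (sign-extending) shift, which is the behavior emulated here. The helper names are hypothetical.

```python
import ctypes

HIGH_BIT = 0x80000000  # assumed: a flag occupying bit 31 of a 32-bit word


def shift_right_signed32(value: int, bits: int) -> int:
    """Emulate `(int)value >> bits` with an arithmetic, sign-extending shift."""
    return ctypes.c_int32(value).value >> bits


def shift_right_unsigned32(value: int, bits: int) -> int:
    """Emulate `(unsigned int)value >> bits` for bits < 32 (zero-filling)."""
    return (value & 0xFFFFFFFF) >> bits
```

Shifting the signed value right by 31 leaves -1 (all bits set) instead of 1, so a loop that walks the flags by shifting never sees the high bits clear. The unsigned variant yields 1, which is why making ll_flags unsigned avoids the error.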
Test-Parameters: trivial
Fixes:
00152903a180 ("LU-13798 llite: parallelize direct i/o issuance")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4b695da74a94d5f204804aa5ab16f83688f7a7f0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52299
Sebastien Buisson [Fri, 2 Jun 2023 13:27:01 +0000 (13:27 +0000)]
EX-7331 sec: add support for encryption plus compression
For compression efficiency, encryption must be carried out on write
after data has been compressed. Otherwise, encrypted data would be
almost incompressible. And on read, decryption must occur before data
is decompressed.
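The ordering argument can be checked with a small experiment. The Python sketch below uses zlib and a toy XOR keystream cipher purely as stand-ins; neither is what Lustre uses, the cipher is not secure, and all names are hypothetical.

```python
import hashlib
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 based keystream (toy cipher, NOT secure).

    XOR is symmetric, so the same call also decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


PLAINTEXT = b"lustre " * 4096  # 28672 highly repetitive bytes
KEY = b"hypothetical-key"

# Write path order from the commit message: compress first, then encrypt.
compress_then_encrypt = toy_encrypt(zlib.compress(PLAINTEXT), KEY)

# Wrong order: ciphertext looks random, so compression gains nothing.
encrypt_then_compress = zlib.compress(toy_encrypt(PLAINTEXT, KEY))

# Read path reverses the good order: decrypt first, then decompress.
restored = zlib.decompress(toy_encrypt(compress_then_encrypt, KEY))
```

Compressing first shrinks the repetitive input to a small fraction of its size, while compressing the ciphertext leaves it essentially as large as the input.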
This means encryption is called on pages produced as a result of
compression. However, for encryption to work, pages need to have a
proper mapping and index.
So we need to manually copy index and mapping from the original page
cache pages, to the pages used to store compressed data. In case of
Direct IO, we leverage information available from the cl_page.
Add sanity-sec test_66 to exercise encryption+compression.
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id54d41365d5a21c54611b8e4af5059088ef87183
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Andreas Dilger [Fri, 8 Sep 2023 22:35:42 +0000 (16:35 -0600)]
EX-8213 tests: skip conf-sanity test_33c in interop
Skip conf-sanity test_33c in interop because this is exercising
memory corruption on the MDS, which can cause it to crash.
Test-Parameters: trivial testlist=conf-sanity env=ONLY=33 serverversion=EXA6.2.0
Fixes:
6a7a2de555 ("LU-17034 tests: memory corruption in PQ")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia0f194b8337f6666cbb292de816f0451d281ed26
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52325
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Tue, 22 Aug 2023 16:32:52 +0000 (12:32 -0400)]
LU-16541 tests: Improve test 64f
The buffered IO part of test 64f has several timing-related
holes and other oddities. The use of multiop in the
background does not guarantee the RPC will not be sent, and
the test doesn't kill it correctly.
Clean this up and make a more reliable version of the test.
Hopefully this will resolve the failures. If not, a
better version of the test will allow debugging.
Lustre-commit:
33e4d86a480b860e0a3b4b51c7c6da6ec0159e51
Lustre-change: https://review.whamcloud.com/52040
Test-Parameters: trivial
Test-Parameters: testlist=sanity envdefinitions=ONLY=64f,ONLY_REPEAT=20
Test-Parameters: testlist=sanity envdefinitions=ONLY=64f,ONLY_REPEAT=20
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I25b825e1d9d516635ef8cbd26dd12809625c34df
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52316
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Fri, 8 Sep 2023 14:54:58 +0000 (17:54 +0300)]
EX-8212 tests: fix conf-sanity_33c
Fix conf-sanity.sh test_33c failure in case of MDSCOUNT > 1.
This test is not yet on master, so this patch is not needed there.
Test-Parameters: trivial testlist=conf-sanity
Fixes:
6a7a2de555 ("LU-17034 tests: memory corruption in PQ")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I37b73a4827895f00cf325cb5cb2da3157a27dc47
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52319
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 7 Sep 2023 00:43:57 +0000 (18:43 -0600)]
RM-620 build: New tag 2.14.0-ddn101
New tag 2.14.0-ddn101
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6569765c5f2f43f800b5ba39499174e03b3247fb
Andreas Dilger [Thu, 7 Sep 2023 00:43:44 +0000 (18:43 -0600)]
RM-620 build: New tag lipe-2.30
New tag lipe-2.30
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Idecd8fed3211452396ce8a3f04bd5a08f8740a04
Patrick Farrell [Wed, 26 Jul 2023 16:34:49 +0000 (12:34 -0400)]
EX-7601 llite: round LDLM lock requests to chunk
When we do IO with compression, we may need to 'fill' the
compression chunk, reading up pages which have already been
written to storage, so we can compress the whole chunk.
Doing this safely requires that any dlmlock we're using
always covers the full chunk.
The easiest way to do this is to round the entire locking
process to include leading or trailing compression chunks.
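The rounding described above can be sketched as follows. This is a Python illustration, not the llite code. It assumes byte extents with an inclusive end and a power-of-two chunk size; the function name is hypothetical.

```python
from typing import Tuple


def round_extent_to_chunks(start: int, end: int, chunk_size: int) -> Tuple[int, int]:
    """Expand the inclusive byte range [start, end] to chunk boundaries."""
    if chunk_size <= 0 or chunk_size & (chunk_size - 1):
        raise ValueError("chunk_size must be a power of two")
    mask = chunk_size - 1
    # Round the start down to the beginning of its chunk and the end
    # up to the last byte of its chunk.
    return start & ~mask, end | mask
```

With 64 KiB chunks, a request for bytes 70000 to 140000 becomes 65536 to 196607, so the lock covers the whole leading and trailing chunks.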
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3c365844561d0da909e6290f4b58ef2211c2d255
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51266
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Patrick Farrell [Fri, 30 Jun 2023 21:15:30 +0000 (17:15 -0400)]
EX-7601 llite: allow aligned DIO with compression
If a DIO is fully aligned to compression chunk boundaries,
it is safe to do on a compressed file, so allow it.
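A minimal sketch of such an alignment check, in Python for illustration. The function name is hypothetical, and any special handling the real code may apply (for example at end of file) is not modeled here.

```python
def dio_is_chunk_aligned(offset: int, length: int, chunk_size: int) -> bool:
    """True if [offset, offset + length) starts and ends on chunk boundaries."""
    if length <= 0 or chunk_size <= 0:
        return False
    return offset % chunk_size == 0 and (offset + length) % chunk_size == 0
```

A 128 KiB write at offset 128 KiB passes with 64 KiB chunks, while a 4 KiB write at the same offset does not, because it would leave a partially written chunk.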
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If8fa3397c9424254538738f4d77f9f50d1c21129
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51529
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Patrick Farrell [Wed, 5 Jul 2023 19:12:07 +0000 (15:12 -0400)]
EX-7601 llite: Compute compression chunk ranges
Determine the edges of any leading and trailing compression
chunks touched by this IO and store them in the cl_io
struct.
The functionality in this patch also allows us to adjust
the lock and read rounding more intelligently. This will
be done in a future patch.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I526563ea347fb0246f97f3532b823c4345c3fa27
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51324
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 16 Aug 2023 16:41:54 +0000 (12:41 -0400)]
EX-6269 osc: Add BRW_COMPRESSED flag to reads
We need to add the BRW_COMPRESSED flag to reads so servers
can know if the client is able to decompress data.
This lets servers decide if a client can be sent compressed
data and the result won't be nonsense/corruption. This is
important for future support of GPU direct, where the
server will need to do the decompression.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I36b5b73f983ce8f2e5297c3e9dc778a5eca54e6a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Mon, 4 Sep 2023 07:10:45 +0000 (00:10 -0700)]
LU-16661 build: use "Recommends: perl" for lustre-iokit
In lustre-iokit, the "plot" commands all use perl, but
the actual "*-survey" scripts are written in bash, so
the "Requires: perl" in lustre.spec.in for lustre-iokit
could be downgraded to "Recommends: perl" for RHEL 8+
(RHEL 7 does not handle "Recommends:").
Lustre-change: https://review.whamcloud.com/52225
Lustre-commit: TBD (from
b5b348bc165d7cacea6fb15e380851b6d676a5e0)
Test-Parameters: trivial testlist=obdfilter-survey
Change-Id: I55f3c57e73ac91cedce745dc4f424c3542978cd4
Fixes:
800a9ec58f78 ("LU-16661 build: improve lustre.spec.in Requires")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52232
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Fri, 1 Sep 2023 16:06:18 +0000 (20:06 +0400)]
EX-8180 lipe: Fix typos, code style, size display in stats
This patch fixes grammatical errors, incorrect display
of statistics in some reports due to incorrect size
alignment, indentation and alignment up to
80 characters where possible. Renamed some types and
variable names. Added additional help information.
Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I97692f006391ddfa5b6e474936e2a8edde69d8c8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52221
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Wed, 10 May 2023 12:49:00 +0000 (14:49 +0200)]
LU-16816 obdclass: make import_event more robust
Make mdc_import_event and osc_import_event more robust, by not
assuming input variables can be dereferenced.
Lustre-change: https://review.whamcloud.com/50915
Lustre-commit:
8d24aa6b8e662a2aa52af4ee652c9f01c2c26cc4
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31a6477d58b7bb9a557ea561f7b0fa3fbcae5762
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52220
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 7 Dec 2021 19:12:00 +0000 (12:12 -0700)]
LU-15338 tests: check whole jobid in sanity 205a
Check the whole jobid string in sanity test_205a to avoid matching
a substring of the jobid twice. This could only currently happen
for the second "dd" test, at a rate about 1/8192, but might also
fail in the future if other tests are added.
Lustre-change: https://review.whamcloud.com/45774
Lustre-commit:
1ee894a4355ecec869754c0b6c566c0e187e27a7
Test-Parameters: trivial testlist=sanity env=ONLY=205a,ONLY_REPEAT=200
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I34b7ed1a7825e3fbad9ea8666fccb2bdc53ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52294
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lei Feng [Wed, 6 Sep 2023 02:39:49 +0000 (10:39 +0800)]
LU-17091 tests: check correct return value in lfs_df
$? is the return value of the last command in a pipe.
In this case we should check the return value of the first
command, 'lfs df'.
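The effect can be shown with a two-stage pipeline built by hand. This Python sketch is only an analogy for the shell behavior: the two commands are stand-ins for a failing 'lfs df' and a succeeding filter, and the helper name is hypothetical.

```python
import subprocess
import sys


def run_pipeline(first_cmd, second_cmd):
    """Run `first_cmd | second_cmd` and return both exit codes."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=first.stdout,
                              stdout=subprocess.DEVNULL)
    first.stdout.close()  # let `first` see SIGPIPE if `second` exits early
    second.wait()
    first.wait()
    return first.returncode, second.returncode


# Stand-ins for a failing first command and a succeeding filter.
FAILING = [sys.executable, "-c", "import sys; sys.exit(3)"]
FILTER = [sys.executable, "-c", "import sys; sys.stdin.read()"]
```

Calling run_pipeline(FAILING, FILTER) returns the failure of the first stage next to a zero status from the last stage. A shell's $? reports only the latter, which is how the failure goes unnoticed.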
Lustre-change: https://review.whamcloud.com/52285
Lustre-commit: TBD (from
8d0b87768034a766a301f945b7a51bf3a3cf0c40)
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanityn
Change-Id: I7daa38f27c878e5195181ed82717cd28ca345dbc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52291
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 19 May 2023 16:24:25 +0000 (12:24 -0400)]
LU-16838 tests: use import name in 398a
The LU-15670 test change assumes ost1_import is always
OST0000. This isn't quite always true, so the test is
failing in certain configurations.
Change it to use the import name.
Lustre-change: https://review.whamcloud.com/51064
Lustre-commit: TBD (from
1927445073bc49c0941e72528590f626b80e9c8f)
Fixes:
649d638467 ("LU-15670 clio: Disable lockless for DIO with O_APPEND")
Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ifaefc503d1118ecd6fd45b661cbe94607f7ad799
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52287
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 5 Sep 2023 20:18:31 +0000 (14:18 -0600)]
LU-16661 build: remove -dev packages for Debian
Don't depend on libmount-dev, libsnmp-dev, libkeyutils-dev for the
lustre-client-utils and lustre-server-utils packages. These are
only needed for build and for the lustre-client-dkms package.
Disable SNMP by default as this is no longer used anywhere.
Lustre-change: https://review.whamcloud.com/52281
Lustre-commit: TBD (from
4bfc45e048d4372332defa3c438b480ed68111f6)
Test-Parameters: trivial testlist=runtests clientdistro=ubuntu2204
Test-Parameters: trivial testlist=runtests clientdistro=ubuntu2004
Fixes:
7dc6e1128a ("LU-15888 build: Debian dkms-debs requires ed and libkeyutils")
Fixes:
af2f77633b ("LU-13818 build: use libsnmp-dev instead of libsnmp30")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib788a97028ee40a9c61070d00b823620ec3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Andreas Dilger [Wed, 6 Sep 2023 17:15:20 +0000 (17:15 +0000)]
DDN-4216 revert: LU-16843 ldiskfs: merge extent blocks
This reverts commit
ce3a417f2403b421282c825d96cb5297852a80de.
This is potentially causing ldiskfs issues and isn't widely hit.
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I23040745a355a60ec205a0996374859695a7db55
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52298
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 1 Sep 2023 13:25:18 +0000 (07:25 -0600)]
RM-620 build: New tag 2.14.0-ddn100
New tag 2.14.0-ddn100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4a8858823099df8791a421a812a4844b51c8f00a
Andreas Dilger [Fri, 1 Sep 2023 13:24:53 +0000 (07:24 -0600)]
RM-620 build: New tag lipe-2.29
New tag lipe-2.29
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0b9a7a2929db6e9fbe2c8463f35dce58937ca7a6
Jian Yu [Tue, 8 Aug 2023 01:12:04 +0000 (18:12 -0700)]
LU-17020 kernel: update RHEL 9.2 [5.14.0-284.25.1.el9_2]
Update RHEL 9.2 kernel to 5.14.0-284.25.1.el9_2 for Lustre client.
Lustre-change: https://review.whamcloud.com/51886
Lustre-commit:
39df815cd6bf0a9dcd0a5e034b749726b194c953
Test-Parameters: trivial clientdistro=el9.2 testlist=sanity
Test-Parameters: trivial clientdistro=el9.2 serverdistro=el8.8 testlist=sanity
Change-Id: Icdbd9cfa18a72d3e6f09f366952e6e0f2ac1ebd2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51887
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Thu, 10 Aug 2023 11:05:52 +0000 (13:05 +0200)]
LU-17023 krb: use a Kerberos realm different from default
It makes sense to give the ability to specify a Kerberos realm that is
different from the default realm as returned by
krb5_get_default_realm().
On client side, the desired realm needs to be specified via the new
'-R' option to lgss_keyring. This can be specified in the config file
/etc/request-key.d/lgssc.conf to replace the default domain, e.g.:
create lgssc * * /usr/sbin/lgss_keyring -R DOMAIN.COM %o %k %t %d %c %u %g %T %P %S
On server side, the desired realm can be specified via the new '-R'
parameter of the lsvcgssd daemon, replacing the default realm.
This patch adds sanity-krb5 test_1b to exercise the new realm options,
by just re-using the same realm as the test system is configured to
use. And former test_1 is renamed test_1a.
Lustre-change: https://review.whamcloud.com/51914
Lustre-commit: TBD (from
7865105966ce9b302504afaa2b1f95b5c2ef48c4)
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9c91d5cb9904781d546e77b1e46115fed433618f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52151
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Tue, 29 Aug 2023 15:38:59 +0000 (19:38 +0400)]
EX-6685 lipe: Add collection of files sizes statistics
Add the ability to collect file size statistics to lipe_scan3
and lipe_find3.
This improvement adds an additional "collect-fsize-stats" option
to generate a file size statistics report for the conditions
specified in the search request. For example, you can collect
statistics on files for a specific user, on a specific modification
time, size, and other available parameters in files.
collect-fsize-stats is added as a new output policy and excludes
other output options (such as print-file-fid ) when this option
is specified. All calculations are made in KB, as this reduces
the potential volume of output data and makes it easier to work
with them through other programs.
Statistics are printed in 3 formats, .out for easy reading, JSON
and in YAML format. The template formatted output of the reports
themselves was copied from fsstats, but works on the lipe engine.
The report itself consists of the following tables showing
statistics on file sizes:
- Files Size;
- Capacity used;
- Equal overhead;
- Positive overhead;
- Negative overhead;
- Directory size;
- Time since creation;
- Time since last modification;
- Time since last metadata modification;
- Time since last access;
- Filename length;
- Storage size by user;
- Storage size by group;
- Storage size by project ID;
- Stripe count;
- Stripe size;
- Mirror count.
Each of the tables has the following structure:
- Header (Stats name);
- Description of the table for each column (from 0 to 8);
- General values relative to table values and stats type;
- Table with 8 columns.
Also generates time-based reports for each user.
Types:
- Time since creation;
- Time since last modification;
- Time since last metadata modification;
- Time since last access;
Each of the tables has the following structure:
- User info
- Header (Stats name);
- Table with 6 or 7 columns.
The file size report can be generated in 4 output formats
(out, yaml, json, csv). To select the format, use the matching
extension in the report file name (for example:
report_name.json). To generate reports in all formats, use the
extension ".all" (for example: report_name.all).
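The extension-based selection can be sketched as below. This is a Python illustration of the rule as stated in this message, not the lipe implementation. The function name is hypothetical, and rejecting unknown extensions with an error is an assumption.

```python
import os

SUPPORTED_FORMATS = ("out", "yaml", "json", "csv")


def report_formats(filename: str) -> list:
    """Map a report file name to the list of formats to generate."""
    ext = os.path.splitext(filename)[1].lstrip(".")
    if ext == "all":
        return list(SUPPORTED_FORMATS)
    if ext in SUPPORTED_FORMATS:
        return [ext]
    # Assumption: anything else is treated as an error.
    raise ValueError("unsupported report extension: %r" % ext)
```

Here report_formats("report_name.json") selects only JSON, and report_formats("report_name.all") selects every supported format.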
Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ied75497e9a53fe5545a0963560ca0638b4f48c76
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50713
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 30 May 2023 21:48:15 +0000 (17:48 -0400)]
EX-7585 pcc: parallel data copy for attach
This patch parallelizes the data copying work for PCC attach
by using multiple threads in the ll_fid_path_copy helper.
Nvidia provided performance numbers for this from their
environment. These are for a 4 MiB I/O size; they reported
that speed was similar but *slightly* lower at larger block
sizes. This is probably an EXT4 limitation, since Lustre
speed scales with those larger sizes (PCC attach is a
copy from Lustre to EXT4).
The numbers are for attaching a single 2 TiB file. They also
reported no performance regression for datasets with many
small files.
threads:  1        2          4           8
speed:    4 GiB/s  7.8 GiB/s  14.1 GiB/s  15.2 GiB/s
Performance improved only very slightly past 8 threads,
and 4 threads is clearly the sweet spot for performance.
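To make the scaling concrete, the speedup and per-thread efficiency can be computed from the reported numbers. This Python sketch only does arithmetic on the figures above; the function name is hypothetical.

```python
# Throughput in GiB/s by thread count, as reported in this message.
SPEED_GIB_S = {1: 4.0, 2: 7.8, 4: 14.1, 8: 15.2}


def scaling(speeds):
    """Return {threads: (speedup vs 1 thread, efficiency per thread)}."""
    base = speeds[1]
    return {threads: (speed / base, speed / base / threads)
            for threads, speed in speeds.items()}
```

At 4 threads the speedup is about 3.5x (roughly 88% efficiency), while 8 threads reach only about 3.8x (under 48% efficiency), which matches the observation that 4 threads is the sweet spot.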
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iffbb3892cfb5b2e71afe15d03f9aec9c84975092
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51171
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Thu, 17 Aug 2023 17:00:03 +0000 (01:00 +0800)]
LU-10026 csdc: DoM pattern could be a combined value
Contains a fix from master that was missing from the patch landed on b_es6_0.
Fix a minor glitch in the lov_getstripe_old code path (in
ll_lov_getstripe_ea_info), which intends to return the last component
stripe info, but commit abf04e7ea3 fails to set the last component
stripe info correctly before using it.
Lustre-change: https://review.whamcloud.com/51978
Lustre-commit: TBD (from
dc654837688fc90320601ec12be140b413b044b2)
Fixes:
b0b262fe09 ("EX-7806 csdc: set DoM compression component")
Fixes:
abf04e7ea3 ("LU-14337 lov: return valid stripe_count/size for PFL files")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id0779c30c004b6979f88bf96b7b7b74a8b8c26e4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52171
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 24 Aug 2023 09:40:46 +0000 (11:40 +0200)]
LU-17050 tests: test Kerberos env in sanity-krb5
Test that the Kerberos environment is sane before trying to launch
sanity-krb5 tests.
Lustre-change: https://review.whamcloud.com/52068
Lustre-commit: TBD (from
8e8dc1b7e715f46b234ae0b018e2ccb464658df4)
Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1675ba7db8c62687c69359a15cc931b5dfd40018
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52150
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Deiter [Thu, 22 Jun 2023 13:28:48 +0000 (17:28 +0400)]
LU-14697 tests: change performance-sanity to use mdtest
Replace mdsrate by mdtest in performance-sanity.sh
Lustre-change: https://review.whamcloud.com/51414
Lustre-commit:
01d16dadab75e5015c494cfa52a783a87f8bc8bf
Test-Parameters: trivial
Test-Parameters: testlist=performance-sanity clientdistro=el7.9
Test-Parameters: testlist=performance-sanity clientdistro=el8.8
Test-Parameters: testlist=performance-sanity clientdistro=el9.2
Test-Parameters: testlist=performance-sanity clientdistro=ubuntu2204
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I1a80bab4ccbe085d3ff8d8b332c8e117e14ea9cb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52172
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Arshad Hussain [Tue, 18 Jul 2023 11:50:47 +0000 (17:20 +0530)]
LU-16605 lfs: Add -0 option to fid2path
Currently fid2path adds '\n' after printing
pathnames. Add '-0' option to fid2path to
add NUL('\0') at the end after printing out
pathnames instead of '\n'. This allows
pathnames that contain newlines('\n') to be
correctly interpreted by binaries like xargs.
Without -0 option:
$ lfs fid2path /mnt/lustre 0x200000401:0x1:0x0
/mnt/lustre/Test
_file
/mnt/lustre/Link_
file
With -0 option:
$ lfs fid2path -0 /mnt/lustre 0x200000401:0x1:0x0 | xargs --null
/mnt/lustre/test
_file /mnt/lustre/link_
file
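A consumer of the NUL-terminated output can split it safely, as in this Python sketch. The sample byte strings mirror the two paths shown above (each contains an embedded newline) and are constructed by hand, not captured from a real run; the helper names are hypothetical.

```python
import os


def split_nul_paths(output: bytes) -> list:
    """Split NUL-terminated output (as with -0) into path names."""
    return [os.fsdecode(p) for p in output.split(b"\0") if p]


def split_newline_paths(output: bytes) -> list:
    """Split newline-terminated output; breaks on embedded newlines."""
    return [os.fsdecode(p) for p in output.split(b"\n") if p]


# Hand-built samples: two paths, each containing a newline.
SAMPLE_NUL = b"/mnt/lustre/Test\n_file\0/mnt/lustre/Link_\nfile\0"
SAMPLE_NL = b"/mnt/lustre/Test\n_file\n/mnt/lustre/Link_\nfile\n"
```

Splitting on NUL recovers exactly two paths, while splitting on newlines yields four fragments, none of which is a valid path.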
Test-case sanity/226e added.
Lustre-change: https://review.whamcloud.com/51736
Lustre-commit:
8d4f9d1befb9962335d4cbc5b89cafced286b066
Test-Parameters: trivial testlist=sanity
Reported-by: Simon Westersund <simon.westersund@csc.fi>
Signed-off-by: Simon Westersund <simon.westersund@csc.fi>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9e3e32cde6c6abe83df48afd191ec167c74ac7e6
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Thu, 31 Aug 2023 13:11:52 +0000 (17:11 +0400)]
LU-17034 tests: memory corruption in PQ
Add conf-sanity_33c to test that there is no
memory corruption in PQ. The test uses an OST
with index 0x7c6 to trigger an access beyond the end
of lqeg_arr, whose size is 64 by default.
Test-Parameters: trivial testlist=conf-sanity env=ONLY=33c
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I401ce80b86701ff611df5f7078b6aecad147d6db
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52198
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sergey Cheremencev [Thu, 24 Aug 2023 00:57:10 +0000 (04:57 +0400)]
LU-17034 quota: tmp fix against memory corruption
Change QMT_INIT_SLV_CNT from 64 to 2000 to avoid accessing
memory beyond the end of the array lqeg_arr. This could happen
when at least one OST has an index larger than the total number
of OSTs. This is a temporary solution, and the maximum supported
OST index is 0x7d0. It will later be replaced by a long-term
solution.
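The size of the problem is easy to see with the numbers involved. This Python sketch assumes the array is indexed directly by OST index, as the description suggests, and does not model the exact boundary semantics of the real code. The constant and function names other than QMT_INIT_SLV_CNT are hypothetical.

```python
OLD_QMT_INIT_SLV_CNT = 64
NEW_QMT_INIT_SLV_CNT = 2000
TEST_OST_INDEX = 0x7c6  # index used by conf-sanity test_33c, 1990 in decimal


def slot_in_bounds(ost_index: int, array_len: int) -> bool:
    """True if an array of array_len entries has a slot for ost_index."""
    return 0 <= ost_index < array_len
```

OST index 0x7c6 is far outside a 64-entry array but fits within 2000 entries, which is why the larger initial count avoids the out-of-bounds access until a long-term fix lands.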
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ic892352b7e833c58ea14bb7cfb98b4946f4ca9bb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52180
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 1 Sep 2023 00:08:30 +0000 (18:08 -0600)]
LU-17005 obdclass: only print seconds for job_stats
Only print the seconds field for the job_stats snapshot_time: field.
Otherwise the addition of the usec and "secs.usec" units field can
break the output parsing.
There is no way to print only the seconds in lprocfs_stats_header(),
since all of the other snapshot_time: fields previously printed the
microseconds field also, so use seq_printf() with the old format.
Test-Parameters: trivial
Fixes:
5efb892396e3 ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1bda709e3bc4231d42f6a98e7487f0b11445f056
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Serguei Smirnov [Thu, 31 Aug 2023 19:34:00 +0000 (12:34 -0700)]
LU-17071 o2iblnd: IBLND_REJECT_EARLY condition causes LBUG
The message printed when kiblnd_passive_connect recognizes the
IBLND_REJECT_EARLY condition introduced by LU-16393 tries
to dereference a NULL pointer in the parameter list. Fix this.
Lustre-change: https://review.whamcloud.com/52202
Lustre-commit: TBD (from
a0fa2440765ee81b173de810b85a5bdb325bd274)
Test-parameters: trivial
Fixes:
1ea489f05b3 ("LU-16393 o2iblnd: add IBLND_REJECT_EARLY reject reason")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I711e5855383c140b9f7c35b27f48995f3f0e25ee
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52211
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Mon, 28 Aug 2023 16:24:25 +0000 (10:24 -0600)]
RM-620 build: New tag 2.14.0-ddn99
New tag 2.14.0-ddn99
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2a58d93788f4bf1d7dbf74dc1490677252de95ce
Patrick Farrell [Wed, 26 Jul 2023 17:04:20 +0000 (13:04 -0400)]
EX-7601 llite: rename compr information in io
The compression related information in the cl_io is named
poorly. It's not that the IO is compressed, it's that the
IO is to a compressed file. The compr_chunk_log_bits is
the maximum from the entire file, not the maximum hit by
this IO. (This is used by readahead, so the maximum in the
whole file is what's desired.)
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I46981a98628f127e7b147280caaf7544fa288786
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51771
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 28 Jun 2023 18:29:50 +0000 (14:29 -0400)]
EX-7601 lov: refactor lov_io_lsme_at
lov_io_lsme_at needs some minor changes to be called from
lov_io_slice_init().
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0611d66052e22d349932eb26257369e07b9b8167
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51495
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Tue, 15 Aug 2023 08:54:54 +0000 (04:54 -0400)]
EX-8027 pcc: add wait option when remove a PCC backend
This patch adds a "wait" option to the PCC tool for removing a
PCC backend from a client:
lctl pcc del --wait $MOUNT $pcc_path
lctl pcc clear --wait $MOUNT
With this option, the caller waits for all in-progress attaches
to finish when removing the PCC backend from a client.
Change-Id: Ic8386329087a7129b0583fa823cbb50673893d0d
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51944
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Mon, 14 Aug 2023 07:32:11 +0000 (03:32 -0400)]
EX-8027 pcc: wait for in-progress attaches when remove PCC
When removing a PCC backend from a client, wait for all in-progress
attaches to finish. Otherwise, the umount of the PCC backend
fails.
The reason is that the PCC copy is referenced in the kernel, not
used by any application in user space, so the tool "lsof" cannot
tell whether the target PCC backend is in use.
Change-Id: I05b268e75841f9f17e77819ed20c85c78d7c6ad6
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51940
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Henri Doreau [Fri, 6 Feb 2015 09:01:36 +0000 (10:01 +0100)]
LU-7073 tests: Add file migration to racer
Make racer run both blocking and non-blocking "lfs migrate" commands.
Implement this within the file_create.sh script, since it is already
selecting among different layout types.
Update Makefile.am to avoid listing every racer filename explicitly
to make it easier to add new types of operations in the future.
Lustre-change: https://review.whamcloud.com/c/fs/lustre-release/+/13669
Lustre-commit: e83569da38138859a51c660dfb5ca5bf45c70a37
Test-Parameters: trivial testlist=racer,racer,racer
Test-Parameters: fstype=zfs testlist=racer,racer,racer
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I51b3f19c78029ff47102e96a71ec4a0fc472183a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52069
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Aurelien Degremont [Tue, 15 Aug 2023 14:03:07 +0000 (16:03 +0200)]
LU-17015 gss: support large kerberos token on client
If the current Kerberos setup uses large tokens, for example
when the PAC feature is enabled for Kerberos, the client can crash.
Return an error instead of asserting to avoid the crash,
and increase the default buffer size from 1kB to 4kB.
This only increases the SEC_CTX_INIT request size, and
the buffer is shrunk before being sent over the wire.
This allows security tokens of up to 2kB to be handled
properly by Lustre. Above that size, a different issue occurs
on the server side, which will require another patch.
Lustre-change: https://review.whamcloud.com/51946
Lustre-commit: TBD (from 374417f3f7c1e74e402a01ae9737ff01334d1dd4)
Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: I9ce30ee7f8c95bfe41525c49986ffac45ffac97c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51951
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Wed, 2 Aug 2023 10:31:57 +0000 (13:31 +0300)]
LU-17011 utils: monotonic clock in lfs mirror
Use monotonic clocks instead of realtime to avoid affecting
bandwidth or hanging the transfer if the system clock is changed.
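The idea can be illustrated with a minimal userspace sketch. It is not the code from this patch: the helper names and the throttling formula are hypothetical, and only clock_gettime(CLOCK_MONOTONIC) is a real POSIX call. Because the monotonic clock does not jump when the wall-clock time is adjusted, the elapsed time used for bandwidth limiting stays meaningful.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current time in nanoseconds from a clock that is
 * not affected by changes to the wall-clock time. */
static uint64_t sketch_now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical throttle: how long to sleep so that 'bytes' copied since
 * 'start_ns' do not exceed 'bytes_per_sec'. Returns 0 if no sleep is
 * needed or if no limit is set (bytes_per_sec == 0). */
static uint64_t sketch_throttle_delay_ns(uint64_t start_ns, uint64_t now_ns,
					 uint64_t bytes,
					 uint64_t bytes_per_sec)
{
	uint64_t elapsed_ns = now_ns - start_ns;
	uint64_t wanted_ns;

	if (bytes_per_sec == 0)
		return 0;

	/* Time the transfer should have taken at the requested rate. */
	wanted_ns = (uint64_t)((double)bytes / (double)bytes_per_sec * 1e9);

	return wanted_ns > elapsed_ns ? wanted_ns - elapsed_ns : 0;
}
```

With a realtime clock, a backward clock step would make elapsed_ns wrap or shrink and the computed delay could become very large, which matches the kind of hang described above.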
Lustre-change: https://review.whamcloud.com/51852
Lustre-commit: TBD (from 81498f782e7c31e6e950352f4dbb2aa6f8052131)
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I58cf327d235448e93fa2ed63cefdf4dd01306e71
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51896
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Tue, 11 Jul 2023 22:40:37 +0000 (15:40 -0700)]
LU-16949 lnet: get monitor thread to update ping buffer
Make sure that ping buffer updates requested by o2iblnd and
socklnd are performed by the LNet monitor thread.
Having the LNDs do these updates directly via an LNet API caused a
lock-up due to spinlock acquisition while in an interrupt context
in a CentOS 7.9 environment.
To avoid LNet trying to update the ping buffer for an LNI which is
still initializing, check that o2iblnd net is fully initialized
(IBLND_INIT_ALL) before requesting the ping buffer update.
Lustre-change: https://review.whamcloud.com/51635/
Lustre-commit: 7ac399c5aec01186ad4c9a7153aea400777c897f
Fixes: da230373bd ("LU-16563 lnet: use discovered ni status")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I87ff8791937f5a0ead6096ff33e8c0a8087f8ddd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51704
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Thu, 3 Aug 2023 09:44:15 +0000 (17:44 +0800)]
LU-17013 lov: fill FIEMAP_EXTENT_LAST flag
If a file has N extents and the fiemap is requested with exactly N
extent slots, the last extent is missing the FIEMAP_EXTENT_LAST
flag. Fix it.
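The boundary case can be sketched as follows. This is a simplified illustration, not the lov code: the struct, the function and the SKETCH_EXTENT_LAST constant are hypothetical stand-ins for the fiemap extent array and FIEMAP_EXTENT_LAST. The point is that the flag must be set whenever the final extent of the file is returned, including when the caller's slot count equals the extent count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EXTENT_LAST 0x1u /* stand-in for FIEMAP_EXTENT_LAST */

struct sketch_extent {
	uint64_t se_logical;
	uint64_t se_length;
	uint32_t se_flags;
};

/* Copy up to 'slots' of the file's 'total' extents from 'src' to 'dst'.
 * The last copied extent is flagged whenever it is the file's final
 * extent, which includes the slots == total case.
 * Returns the number of extents copied. */
static size_t sketch_fill_extents(const struct sketch_extent *src,
				  size_t total,
				  struct sketch_extent *dst, size_t slots)
{
	size_t n = total < slots ? total : slots;
	size_t i;

	assert(src != NULL || total == 0);
	assert(dst != NULL || slots == 0);

	for (i = 0; i < n; i++) {
		dst[i] = src[i];
		dst[i].se_flags &= ~SKETCH_EXTENT_LAST;
	}

	/* All extents fit: the last one returned is the last in the file. */
	if (n > 0 && n == total)
		dst[n - 1].se_flags |= SKETCH_EXTENT_LAST;

	return n;
}
```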
Lustre-change: https://review.whamcloud.com/51863
Lustre-commit: TBD (from 264ab0c258adbce93d582e5e97f05ff7bf87c18a)
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanityn env=ONLY="71a 71b 71c"
Change-Id: I4556b31f0d04bdf8e83f323e83b871b093beaa5e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52114
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Etienne AUJAMES [Thu, 9 Jun 2022 20:50:06 +0000 (22:50 +0200)]
LU-15926 nrs: fix tbf realtime rules
tc_nsecs_resid should be reset to 0 when changing a rule; otherwise
this could lead to MDS crashes for realtime policies:
nrs_tbf_req_get(): ASSERTION( cli->tc_nsecs_resid < cli->tc_nsecs )
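A minimal sketch of why the reset matters, assuming a simplified client state. Only the field names tc_nsecs and tc_nsecs_resid and the asserted invariant come from the message above; the struct, the helpers and the way tc_nsecs is derived from a rate are hypothetical. A residue accumulated under a slow rule can exceed the per-token interval of a faster rule, so it is dropped when the rule changes.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Hypothetical, reduced view of the per-client TBF state. */
struct sketch_tbf_client {
	uint64_t tc_nsecs;       /* nanoseconds per token under the rule */
	uint64_t tc_nsecs_resid; /* leftover nanoseconds between refills */
};

/* Apply a new rule rate (assumption: the interval is derived from an
 * RPCs-per-second rate). */
static void sketch_tbf_set_rate(struct sketch_tbf_client *cli,
				uint64_t rpcs_per_sec)
{
	assert(rpcs_per_sec > 0 && rpcs_per_sec <= SKETCH_NSEC_PER_SEC);

	cli->tc_nsecs = SKETCH_NSEC_PER_SEC / rpcs_per_sec;
	/* The residue from the old rule may be >= the new tc_nsecs,
	 * which would break the invariant below, so reset it. */
	cli->tc_nsecs_resid = 0;
}

/* The invariant quoted in the assertion message. */
static int sketch_tbf_invariant_ok(const struct sketch_tbf_client *cli)
{
	return cli->tc_nsecs_resid < cli->tc_nsecs;
}
```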
Lustre-change: https://review.whamcloud.com/47585
Lustre-commit: 530861b344e46bef51c80adac4640c4586d8463a
Fixes: d11fa2c27959 ("LU-9228 nrs: TBF realtime policies under congestion")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I280acb42e104088c6b8750a0bb7bf9c50cf96e73
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52115
Reviewed-by: Qian Yingjin <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 4 Jul 2023 07:28:37 +0000 (09:28 +0200)]
LU-16760 utils: support 'lfs find --attrs' and '-printf %La'
Add support to "lfs find" to filter on file attribute flags, with the
syntax "[!] --attrs=[^]ATTR[,...]".
Add support to "lfs find" to print file attribute flags with
"-printf %La".
Lustre-change: https://review.whamcloud.com/51562
Lustre-commit: f0ab3ac6d6e31472c20ef538b799b96a512087f7
Add sanity-sec test_65 for Encrypted and Immutable flags.
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5e5cfe5c8c8cbed8bb79f3ad6d8116347ecfe6ac
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52067
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Fri, 25 Aug 2023 11:01:53 +0000 (14:01 +0300)]
EX-8150 tests: hot-pools doesn't need yq
Only a single test uses yq, and only for a trivial check,
so we don't really need yq.
Test-Parameters: fortestonly testlist=hot-pools
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib11bd85938646ba1387d26e0d39cc54dcfe04bf0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52092
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Mon, 21 Aug 2023 09:44:32 +0000 (11:44 +0200)]
LU-17043 enc: fix osd lookup cache for long encrypted names
Fix osd lookup cache to support files with long encrypted names.
Those encrypted names can be up to 256 bytes, not NUL terminated.
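A minimal sketch of a cache entry that copes with such names, under the assumption that names are stored and compared by explicit length. The struct and function names are hypothetical and this is not the osd code. Since encrypted names are binary and not NUL-terminated, string functions such as strcmp() or strlen() cannot be used on them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NAME_MAX 256 /* longest encrypted name, per the text above */

/* Hypothetical cache entry: raw name bytes plus an explicit length. */
struct sketch_cached_name {
	size_t        cn_len;
	unsigned char cn_name[SKETCH_NAME_MAX];
};

/* Store a name; returns 0 on success, -1 if it does not fit. */
static int sketch_cache_store(struct sketch_cached_name *cn,
			      const void *name, size_t len)
{
	assert(cn != NULL && name != NULL);

	if (len > SKETCH_NAME_MAX)
		return -1;

	memcpy(cn->cn_name, name, len);
	cn->cn_len = len;
	return 0;
}

/* Compare by length and raw bytes; embedded zero bytes are fine. */
static int sketch_cache_match(const struct sketch_cached_name *cn,
			      const void *name, size_t len)
{
	return cn->cn_len == len && memcmp(cn->cn_name, name, len) == 0;
}
```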
Lustre-change: https://review.whamcloud.com/52016
Lustre-commit: TBD (from 51a526bfa61bb5391a7ac33108e264f590cd3f0c)
Fixes: 07a7befdc1 ("LU-16405 osd: lookup cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ica2329c8a0990395307a14fe9bb9d43db3b364ed
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52017
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Mon, 3 Apr 2023 10:32:59 +0000 (13:32 +0300)]
LU-16405 osd: lookup cache
The MDT may need to look up names again that it has just checked
(after locking). Introduce a trivial, tiny per-thread cache in the
OSD to make such a repeated lookup cheap.
The original issue is that ext4_add_entry() doesn't really
check for a possible duplicate (that would be expensive, as
a whole 4K block must be scanned).
Important: the cache is reset upon request processing completion, as
we don't update iversion on disk (due to a conflict with VBR).
Lustre-change: https://review.whamcloud.com/50521
Lustre-commit: 29f8eb2a67ba2806d91d93de1e82e05a63f76382
Fixes: 79acb9a9e7 ("LU-10235 mdt: mdt_create: check EEXIST without lock")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I40c3ee702f7895c3bda00b380f904cd587e0a1c4
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Thu, 13 Jul 2023 00:29:56 +0000 (17:29 -0700)]
LU-16393 o2iblnd: add IBLND_REJECT_EARLY reject reason
Add the IBLND_REJECT_EARLY reason for rejecting a connection request,
to be used when the device doesn't have any nets added yet or
when there are no active NIs on the net to handle the connection.
These conditions are supposed to occur only when LNI is being
added/initialized, so report at CNETERROR level vs. CERROR.
In lnet, set NI state to ACTIVE only after it has been added
to the list of NIs for the net, so that LND can know that
the NI can be used to accept connections.
Lustre-change: https://review.whamcloud.com/51651
Lustre-commit: 673ff86a84ad5d11cde24aa7411c45385ad1c633
Test-parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I59efb2fdf5d5ceabb6ff23f638ec85da82d57b99
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52015
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 10 Aug 2023 07:35:22 +0000 (00:35 -0700)]
LU-16943 tests: use primary ost1 server in replay-single/135
This patch fixes replay-single test_135() to make sure
the primary ost1 server is used at the beginning of the test.
Test-Parameters: trivial testlist=replay-single
Test-Parameters: trivial env=FAILURE_MODE=HARD \
clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
austeroptions=-R failover=true iscsi=1 \
testlist=replay-single,mmp
Fixes: 18a424a0db1d ("LU-16943 tests: fix replay-single/135 under hard failure mode")
Change-Id: Ia25314255c9f00ba71687e1f757517f37031caed
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51913
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Fri, 28 Jul 2023 20:31:29 +0000 (00:31 +0400)]
LU-16298 ldiskfs: Periodically write ldiskfs superblock
This patch introduces a mechanism to periodically check and update
the superblock within the ext4 file system. The main purpose of this
patch is to keep the disk superblock up to date. The update will be
performed if more than one hour has passed since the last update,
and if more than 16MB of data have been written to disk.
This check and update is performed within the
ext4_journal_commit_callback function, ensuring that the superblock
is written while the disk is active, rather than based on a timer
that may trigger during disk idle periods.
Lustre-change: https://review.whamcloud.com/51340
Lustre-commit: e27a7b33d6351ff8b8bae101079af88f4eedac99
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I06eb9624b663a6ca6b15c6af2373b82f1bb63de6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51717
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 7 Jul 2023 19:57:52 +0000 (13:57 -0600)]
LU-16872 tests: exercise sanity test_27M more fully
Improve the sanity.sh test_27M to precreate a bunch of files with
specific OST striping so that it is more likely to trigger the code
path that accessed the stale OST list when using O_APPEND layout.
Also clean up code style in the rest of this subtest.
Lustre-change: https://review.whamcloud.com/51602
Lustre-commit: 7bb1685048bf999df03ceadab39faa09b8a5560d
Test-Parameters: trivial testlist=sanity env=ONLY=27M,ONLY_REPEAT=200
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie94e3a32fc48198e4e15f44a55d1f8ccf61c74f5
Reviewed-by: Thomas Bertschinger <bertschinger@lanl.gov>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52013
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Thomas Bertschinger [Fri, 7 Jul 2023 14:57:40 +0000 (10:57 -0400)]
LU-16872 lod: reset llc_ostlist when using O_APPEND stripes
Files created with O_APPEND can have special striping set with the
parameters mdd.*.append_stripe_count and mdd.*.append_pool, and
should not inherit a list of OSTs to use from a parent directory
when these parameters are set. However, if a file is created with
O_APPEND and its create is handled by a kernel thread that has
previously created a file with a default list of OSTs, then those
defaults were erroneously applied to the O_APPEND file. This can
lead to the create returning EINVAL or to a crash.
This commit ensures that llc_ostlist is cleared when a file is
created with special append stripes.
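The failure mode is a classic stale-state problem with a reused per-thread structure. The sketch below illustrates it with hypothetical types and names; it is not the lod code, and only the idea of clearing the OST list for append striping comes from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_OSTS 8

/* Hypothetical per-thread layout config, reused across creates. */
struct sketch_layout_cfg {
	uint32_t slc_stripe_count;
	size_t   slc_ost_count; /* valid entries in slc_osts, 0 = none */
	uint32_t slc_osts[SKETCH_MAX_OSTS];
};

/* Prepare the reused config for an O_APPEND create that uses the
 * special append striping. */
static void sketch_cfg_for_append(struct sketch_layout_cfg *cfg,
				  uint32_t append_stripe_count)
{
	assert(cfg != NULL);

	cfg->slc_stripe_count = append_stripe_count;
	/* Drop any OST list left behind by an earlier create handled by
	 * this thread; otherwise it would be applied to this file and
	 * could disagree with the append stripe count. */
	cfg->slc_ost_count = 0;
}
```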
Lustre-change: https://review.whamcloud.com/51559
Lustre-commit: 766b35a9700f36aa08b652fa9d18b890d34bf4a5
Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Ib2023e17c9ef31a2e029e09e67b257eb2c77b113
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52012
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Tue, 22 Aug 2023 01:38:36 +0000 (21:38 -0400)]
Revert "LU-16651 llite: hold invalidate_lock when invalidate cache pages"
This reverts patch 4debbda73f because of the hang issues
documented in NVDA-182.
Lustre-change: https://review.whamcloud.com/50371
Lustre-commit: bba59b1287c9cd8c30a85fafb4fd5788452bd05c
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5a7b67b79d9594f9f03150acca4fc542c09c0798
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52024
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Wed, 2 Aug 2023 11:30:09 +0000 (19:30 +0800)]
LU-16958 llite: call truncate_inode_pages() in inode lock
In some cases vvp_prune()->truncate_inode_pages() is called
without an IO context, so we need to protect it with the inode lock
as well.
So we add ll_inode_info::lli_inode_lock_owner and set it according to
the vfs locking rules (Documentation/filesystems/Locking or
Documentation/filesystems/locking.rst), so that before calling
truncate_inode_pages() we lock the inode if it is not already locked
in vfs.
In lov_conf_set(), when the inode lock is required, we take into
account the inode size lock, inode layout lock and lov conf lock that
it may already hold, and it must also take these locks in order to
avoid a deadlock.
Lustre-commit: 51d62f2122fee14fbb3ff8333b5a830e1181e4e5
Lustre-change: https://review.whamcloud.com/50857
Lustre-commit: 8f2c1592c3bbd0351ab3984a88a3eed7075690c8
Lustre-change: https://review.whamcloud.com/51641
Fixes: ef9be34478 ("LU-16637 llite: call truncate_inode_pages() under inode lock")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7ee58039a6d31daefc625ac571a52baf112f8151
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51644
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 21 Aug 2023 08:53:50 +0000 (02:53 -0600)]
RM-620 build: New tag 2.14.0-ddn98
New tag 2.14.0-ddn98
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I21ed1babb45f0b7f66e32fa4cfd425fab1da29b3
Andreas Dilger [Mon, 21 Aug 2023 08:53:26 +0000 (02:53 -0600)]
RM-620 build: New tag lipe-2.28
New tag lipe-2.28
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40ca1c408e11bed1e7ddfcf700930c0ffe9df0fc
Vitaly Fertman [Fri, 31 Mar 2023 18:04:44 +0000 (21:04 +0300)]
LU-15535 llite: deadlock on lli_lsm_sem
One process may be doing a lookup and, after the reply, while still
holding the LDLM lock, try to update the LSM/default LSM of a dir
under the write lli_lsm_sem.
Meanwhile another process has taken the read lli_lsm_sem (taken for
all the MD ops in ll_prep_md_op_data()) and is waiting on the server
for a conflicting PW LDLM lock in order to modify the same dir.
This can happen on restriping with LSM or on changing the default LSM,
but the most common trigger is a racer run, even without striped dirs:
- racer runs lfs mkdir -i $i <subdir> for each MDS, which creates a
  default LSM on these subdirs that is inherited endlessly to keep the
  MDS index;
- racer also runs mkdir -p <path>, in which case we do:
  ll_new_node - create a parent dir, no RMF_DEFAULT_MDT_MD in reply
  ll_lookup parent it=open - no RMF_DEFAULT_MDT_MD in reply
  ll_new_node - create a child
The default LSM is inherited when the parent is created. However, since
those RPCs carry no lookup LDLM lock and no data, the default layout is
not yet set in the parent's inode when the child is created. Thus a
parallel lookup which gets the LSM deadlocks with this ll_new_node().
At the same time, similar to CLIO, we do not need to hold a semaphore or
an LDLM lock over the whole operation to prevent LSM modification on the
server. We just need to take an up-to-date LSM (this is a subject for
LU-16320) and guarantee that the op works on this LSM on the client for
the whole operation.
The solution is to let MD ops work on a copy of the LSM, thereby letting
others modify the LSM attached to the inode in parallel if needed.
Lustre-change: https://review.whamcloud.com/50489
Lustre-commit: 3ebc8e0528e34a11ffeff1e6be347de18b248069
HPE-bug-id: LUS-10725
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3137300b5bcce2e890994ce8751cdf7fce2f3f54
Reviewed-on: https://es-gerrit.hpc.amslabs.hpecorp.net/161525
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>