Whamcloud - gitweb
fs/lustre-release.git
7 months ago New tag 2.15.58 (tags: 2.15.58, v2_15_58)
Oleg Drokin [Fri, 1 Sep 2023 20:38:40 +0000 (16:38 -0400)]
New tag 2.15.58

Change-Id: I6d58a43d5904c24d32575b4790bcaabd9ebdfb6f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17038 tests: remove unused compile.sh script 54/52054/2
Timothy Day [Wed, 23 Aug 2023 16:26:41 +0000 (16:26 +0000)]
LU-17038 tests: remove unused compile.sh script

This script just runs make automatically. It doesn't
appear to be called by any other Lustre sanity
test script. I doubt it has been used in many
years. This patch removes it.

Checked for usage using:

 `git grep -i "compile.sh"`

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If1615196bc8d004a63ad8baddd1d3fe3e360dc74
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52054
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17038 tests: remove mlink utility 51/52051/4
Timothy Day [Wed, 23 Aug 2023 02:52:32 +0000 (02:52 +0000)]
LU-17038 tests: remove mlink utility

The mlink utility is nearly identical to the link utility
provided by coreutils. They only differ by some GNU
boilerplate. All tests using mlink are replaced with link.
Luckily, mlink is only used in a few places.

Used the following command:

 `git grep -i mlink | grep -i -v symlink`

to track down all uses of mlink.

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I197235572d2cb267ee68930c64058e4f5ffe5be1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-12678 lnet: discard lnet_kvaddr_to_page 41/52041/3
Mr NeilBrown [Wed, 23 Aug 2023 00:18:41 +0000 (20:18 -0400)]
LU-12678 lnet: discard lnet_kvaddr_to_page

This function is not needed, so discard it.

Change-Id: Iffe9745adf477a5f4b78d8ef191849179426cb07
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17043 enc: fix osd lookup cache for long encrypted names 16/52016/2
Sebastien Buisson [Mon, 21 Aug 2023 09:44:32 +0000 (11:44 +0200)]
LU-17043 enc: fix osd lookup cache for long encrypted names

Fix osd lookup cache to support files with long encrypted names.
Those encrypted names can be up to 256 bytes, not NUL terminated.

Fixes: 29f8eb2a67 ("LU-16405 osd: lookup cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ica2329c8a0990395307a14fe9bb9d43db3b364ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52016
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-15367 llite: iotrace standardization 02/52002/3
Patrick Farrell [Fri, 18 Aug 2023 18:31:32 +0000 (14:31 -0400)]
LU-15367 llite: iotrace standardization

Clean up and standardize some of the iotrace messages for
easier parsing.

Add a clear 'START' indicator.

Remove a now-redundant debug message in the mmap code.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia620cc8c783509cbc3f47b21a274d67d860b80e7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52002
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
7 months ago LU-17039 build: cleanup ib_dma_map_sg 79/51979/2
Shaun Tancheff [Fri, 18 Aug 2023 04:50:56 +0000 (23:50 -0500)]
LU-17039 build: cleanup ib_dma_map_sg

CONFIG_INFINIBAND_VIRT_DMA is a kernel configuration option
that in some cases conflicts with the configuration of the
externally provided OFED stack.

During configure, when ib_dma_map_sg fails to build correctly,
we can simply #undef CONFIG_INFINIBAND_VIRT_DMA to resolve
the inconsistent configuration that breaks ib_dma_map_sg.

Test-Parameters: trivial
HPE-bug-id: LUS-11771
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Id0849464d3ffbd573cac13016191d80c6ea991af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51979
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17038 tests: remove munlink utility 77/51977/4
Andreas Dilger [Thu, 17 Aug 2023 22:06:36 +0000 (16:06 -0600)]
LU-17038 tests: remove munlink utility

The munlink utility is obsoleted by the unlink command added in
the coreutils package many moons ago, and can be removed.  All
tests using munlink are replaced with unlink.

Test-Parameters: trivial testlist=recovery-small,replay-dual,replay-single
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I984406525ed958814bd8af74a2d81c4920e320b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16510 build: check if CONFIG_FORTIFY_SOURCE is defined 73/51973/2
Jian Yu [Thu, 17 Aug 2023 20:46:38 +0000 (13:46 -0700)]
LU-16510 build: check if CONFIG_FORTIFY_SOURCE is defined

The linux/fortify-string.h header file should not be
included while the kernel config option CONFIG_FORTIFY_SOURCE
is not defined.

Change-Id: I2e1905406e892b182f143d512a2d3722b141e52d
Fixes: 919b93b951d4 ("LU-16510 build: fortified memcpy from linux 6.1")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51973
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 months ago LU-17036 utils: make sure resize option is legit 70/51970/2
Li Dongyang [Thu, 17 Aug 2023 13:27:00 +0000 (23:27 +1000)]
LU-17036 utils: make sure resize option is legit

To align the metadata on 1MB boundaries we manually
set the resize blocks to 16368G for 4K block size.
However, mke2fs expects the resize value to be larger
than the device size.

For devices between 16368G and 16384G, mke2fs
will fail with:
The resize maximum must be greater than the filesystem size.
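
The size constraint above can be checked with a little arithmetic. This is a minimal Python sketch, not the code from the patch: the function names are hypothetical, and the only figures taken from the message are the 16368G resize value and the 4K block size.

```python
GIB = 1 << 30

# Resize value quoted in the commit message, expressed in bytes.
RESIZE_LIMIT_BYTES = 16368 * GIB


def resize_blocks(block_size=4096):
    """Number of filesystem blocks covered by the fixed resize value."""
    return RESIZE_LIMIT_BYTES // block_size


def resize_option_is_valid(device_bytes, block_size=4096):
    """True when the resize value exceeds the device size in blocks.

    Illustrative check only: it mirrors the condition described by the
    mke2fs error message, not the actual mke2fs or Lustre source.
    """
    device_blocks = device_bytes // block_size
    return resize_blocks(block_size) > device_blocks
```

With a 4K block size, a 16000G device passes the check, while a 16376G device (between 16368G and 16384G) does not, which is the failing range the message describes.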

Change-Id: I4567a79c1405e9527d7f0f9bec4c8a7aae0eba6c
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51970
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17031 build: fix redefine __compiletime_strlen error 53/51953/2
Qian Yingjin [Wed, 16 Aug 2023 02:11:39 +0000 (22:11 -0400)]
LU-17031 build: fix redefine __compiletime_strlen error

The Lustre build failed on Ubuntu 22.04 with kernel v5.17, reporting
"redefine __compiletime_strlen".
This patch fixes the build error.

Fixes: 919b93b951 ("LU-16510 build: fortified memcpy from linux 6.1")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ic26daecd6b91614e01b5b0030f40eede205a21f7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51953
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17030 llite: allow setting max_cached_mb to a % 52/51952/7
Patrick Farrell [Tue, 15 Aug 2023 23:08:12 +0000 (19:08 -0400)]
LU-17030 llite: allow setting max_cached_mb to a %

Lustre's max_cached_mb parameter is hard to use because it
must be set to a specific numeric value, so in effect it
cannot be set on the server side unless all clients are
guaranteed identical.

Let's add the ability to set that to a % of memory to make
it more useful.
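
To illustrate why a percentage is easier to apply across clients with different amounts of memory, here is a small Python sketch. The accepted syntax is not shown in this message, so the "50%" form and the function name are assumptions made for illustration only.

```python
def max_cached_mb_from_setting(setting, total_ram_mb):
    """Return a cache limit in MiB from either 'N' or 'N%'.

    Hypothetical helper: a plain number is taken as MiB, while a value
    ending in '%' is taken as a share of the client's total RAM.
    """
    text = setting.strip()
    if text.endswith("%"):
        percent = int(text[:-1])
        if not 0 <= percent <= 100:
            raise ValueError("percentage must be between 0 and 100")
        return total_ram_mb * percent // 100
    return int(text)
```

The same "50%" setting yields 32768 MiB on a client with 64 GiB of RAM and 8192 MiB on a client with 16 GiB, whereas a fixed numeric value is identical on both.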

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1f9f5a8a5d671ab00b7ab6133bb9b1d1214ca59e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-10885 docs: note flock now being enabled by default 48/51948/2
Laura Hild [Tue, 15 Aug 2023 17:04:37 +0000 (13:04 -0400)]
LU-10885 docs: note flock now being enabled by default

mount -o flock was made the default, but the mount.lustre(8) man-page
still said noflock is default.  Text based on comments in LU-10885 and
http://wiki.lustre.org/Mounting_a_Lustre_File_System_on_Client_Nodes.

Signed-off-by: Laura Hild <lsh@jlab.org>
Change-Id: I48bfc0260fb948771f5cf4fb8cbc6ee9588e2217
Test-Parameters: trivial
Fixes: 16fb13eb3863 ("LU-10885 llite: enable flock mount option by default")
Fixes: 3613af3e15cb ("LU-10885 llite: enable flock mount option by default")
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51948
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17015 gss: support large kerberos token on client 46/51946/6
Aurelien Degremont [Tue, 15 Aug 2023 14:03:07 +0000 (16:03 +0200)]
LU-17015 gss: support large kerberos token on client

If the current Kerberos setup uses large tokens, such as
when the PAC feature is enabled for Kerberos, the client can crash.

Return an error instead of asserting to avoid the crash
and increase the default buffer size to 4kB instead of 1kB.
This will only increase the SEC_CTX_INIT request size, and
the buffer is shrunk before being sent over the wire.

This will allow security tokens of up to 2kB to be properly
handled by Lustre. Above that size, a different issue
occurs on the server side that will require another patch.

Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: I9ce30ee7f8c95bfe41525c49986ffac45ffac97c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51946
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 months ago LU-17006 lnet: set up routes for going across subnets 21/51921/4
Serguei Smirnov [Fri, 11 Aug 2023 00:58:11 +0000 (17:58 -0700)]
LU-17006 lnet: set up routes for going across subnets

Modify ksocklnd-config to set up a route that uses the
subnet's default gateway when a default gateway
is defined, for example:
        ip route add default via <gw_for_eth0> dev eth0 table eth0
which results in a route similar to the following added to
the eth0 route table:
        default via <gw_for_eth0> dev eth0

If no gateway is found for the eth0 subnet, keep the old
behaviour, which results in the following being added to the eth0
route table:
        <eth0_subnet> dev eth0 proto kernel scope link src <eth0_ip>

This makes sure that MR traffic goes out the intended interface
as selected by LNet no matter whether going across subnets or not.
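
The choice between the two routes can be sketched as follows. This is an illustrative Python helper, not the ksocklnd-config script itself: the function name is hypothetical, and the exact command used in the no-gateway case is an assumption reconstructed from the route entry quoted above.

```python
def route_args(dev, gateway=None, subnet=None, src_ip=None):
    """Build the `ip route add` argument list for a per-interface table.

    Assumes the routing table is named after the interface, as in the
    example from the commit message.
    """
    if gateway:
        # A default gateway is defined for the subnet: route through it.
        return ["ip", "route", "add", "default", "via", gateway,
                "dev", dev, "table", dev]
    if subnet is None or src_ip is None:
        raise ValueError("subnet and src_ip are required without a gateway")
    # No gateway: keep the old link-scope route for the local subnet.
    return ["ip", "route", "add", subnet, "dev", dev,
            "proto", "kernel", "scope", "link", "src", src_ip,
            "table", dev]
```

Returning an argument list rather than a shell string keeps the sketch free of quoting concerns; it could be passed to `subprocess.run` on a host where the command is appropriate.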

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I84a299c8b7eb4cdb4fc24408a1e42ad0283d9219
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51921
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16766 obdclass: trim kernel thread names in jobids 19/51919/2
Thomas Bertschinger [Thu, 13 Jul 2023 22:32:52 +0000 (18:32 -0400)]
LU-16766 obdclass: trim kernel thread names in jobids

When collecting jobstats on operations coming from kernel threads, it
is more useful and reduces the noisiness of the data if the names of
kernel threads are trimmed so that all "kworker/CPU:ID" threads are
collected under "kworker", all "ll_sa_PID" threads under ll_sa, etc.
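
The trimming described above can be sketched in a few lines of Python. The actual rules live in the obdclass C code and are not shown here, so the two rules below (cut at the first "/", then drop a trailing "_<digits>" suffix) are assumptions chosen to reproduce the two examples in the message.

```python
import re


def trim_kthread_name(comm):
    """Collapse per-CPU or per-PID kernel thread names into one job name.

    Hypothetical helper: 'kworker/3:1' becomes 'kworker' and
    'll_sa_1234' becomes 'll_sa'; other names are returned unchanged.
    """
    base = comm.split("/", 1)[0]        # drop the '/CPU:ID' part
    return re.sub(r"_\d+$", "", base)   # drop a trailing '_PID'
```

Grouping by the trimmed name means that statistics for all kworker threads land in a single jobstats entry instead of one entry per thread.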

Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Icd82a99c1153de0277ea5ed3f4b1d92535809921
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51919
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17020 kernel: update RHEL 9.2 [5.14.0-284.25.1.el9_2] 86/51886/4
Jian Yu [Tue, 8 Aug 2023 22:43:03 +0000 (15:43 -0700)]
LU-17020 kernel: update RHEL 9.2 [5.14.0-284.25.1.el9_2]

Update RHEL 9.2 kernel to 5.14.0-284.25.1.el9_2.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el9.2 serverdistro=el9.2 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el9.2 serverdistro=el9.2 testlist=sanity

Change-Id: Icdbd9cfa18a72d3e6f09f366952e6e0f2ac1ebd2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51886
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17013 lov: fill FIEMAP_EXTENT_LAST flag 63/51863/9
Lei Feng [Thu, 3 Aug 2023 09:44:15 +0000 (17:44 +0800)]
LU-17013 lov: fill FIEMAP_EXTENT_LAST flag

If a file has N extents and the fiemap is requested with exactly N
extent slots, the last extent will be missing the FIEMAP_EXTENT_LAST
flag. Fix it.
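
The expected behaviour can be modelled with a short Python sketch. It is not the lov code: the function name and the tuple layout are hypothetical, and only the flag value is taken from the kernel's fiemap header.

```python
FIEMAP_EXTENT_LAST = 0x00000001  # flag value from <linux/fiemap.h>


def fill_fiemap(file_extents, slots):
    """Copy up to `slots` extents and flag the file's final extent.

    `file_extents` is a list of (logical_offset, length) pairs; the
    result is a list of (logical_offset, length, flags) tuples.
    """
    out = []
    for index, (logical, length) in enumerate(file_extents[:slots]):
        flags = 0
        # The flag belongs to the file's last extent, including the
        # case where the caller supplied exactly N slots for N extents.
        if index == len(file_extents) - 1:
            flags |= FIEMAP_EXTENT_LAST
        out.append((logical, length, flags))
    return out
```

In this model a caller with three slots for a three-extent file sees the flag on the third entry, while a caller with only two slots sees no flag and knows more extents remain.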

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: testlist=sanityn env=ONLY=71a+71b+71c
Change-Id: I4556b31f0d04bdf8e83f323e83b871b093beaa5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51863
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
7 months ago LU-17011 utils: monotonic clock in lfs mirror 52/51852/4
Alex Zhuravlev [Wed, 2 Aug 2023 10:31:57 +0000 (13:31 +0300)]
LU-17011 utils: monotonic clock in lfs mirror

Use monotonic clocks instead of realtime to avoid affecting
bandwidth or hanging the transfer if the system clock is changed.
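
The reason for preferring a monotonic clock can be shown with a small Python sketch. The lfs code is written in C; the function below is a hypothetical stand-in that only demonstrates the timing idea using Python's `time.monotonic()`.

```python
import time


def transfer_rate(nbytes, started, now=None):
    """Bytes per second since `started`, measured on the monotonic clock.

    `started` and `now` are time.monotonic() readings. Because that clock
    never goes backwards, changing the wall-clock time cannot produce a
    negative or wildly inflated elapsed interval.
    """
    if now is None:
        now = time.monotonic()
    elapsed = now - started
    return nbytes / elapsed if elapsed > 0 else 0.0
```

A caller would record `started = time.monotonic()` before copying and call `transfer_rate(copied, started)` afterwards. With a wall-clock source such as `time.time()`, the same calculation could be skewed by a clock adjustment made during the transfer.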

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I58cf327d235448e93fa2ed63cefdf4dd01306e71
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51852
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 months ago LU-17009 tests: fix runtests to read file name with backslash 47/51847/2
Jian Yu [Wed, 2 Aug 2023 07:16:04 +0000 (00:16 -0700)]
LU-17009 tests: fix runtests to read file name with backslash

If a file in the /etc directory has a backslash in its name, runtests
will fail because the read command treats the backslash as
an escape character. This patch fixes the issue by adding the "-r"
option to read.

Change-Id: Iab912ba9708f5b64e6bb8d8adc266ff23ed32de5
Test-Parameters: trivial testlist=runtests
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17000 lnet: remove redundant errno check in liblnetconfig.c 46/51846/3
Jake McManus [Thu, 10 Aug 2023 03:12:03 +0000 (23:12 -0400)]
LU-17000 lnet: remove redundant errno check in liblnetconfig.c

Variable root is assigned NULL at the beginning of
lustre_lnet_show_stats(). If l_ioctl() fails, its return value,
stored in rc, takes the true path in the following conditional.
This conditional currently contains a redundant check for errno,
even though rc would equal -errno in this case. If errno had
changed between the l_ioctl() call and this subsequent read, errno
could be 0, which would, from the out: label, lead to a NULL
root being passed to cYAML_insert_sibling() and the NULL root
pointer being dereferenced.

Replaced l_errno's use as a parameter in strerror with -rc, and
removed the declaration and other references to l_errno.

Addresses-Coverity-ID: 397850 ("Explicit null dereferenced")

Signed-off-by: Jake McManus <jacobpmcmanus@gmail.com>
Change-Id: I78f080837b60c8216c52bda8562d4c0f9f45a132
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51846
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16866 tests: Use wait_update to check LNet recovery state 45/51845/4
Chris Horn [Mon, 31 Jul 2023 19:03:57 +0000 (13:03 -0600)]
LU-16866 tests: Use wait_update to check LNet recovery state

The monitor thread is sometimes woken up on demand and sometimes sleeps
for one-second intervals. This makes it hard to predict precisely how
long we need to sleep for ping counts to update and NIs to be
processed out of recovery.
Use wait_update when checking LNet recovery queues and ping counts.
Additional drop rules are added to tests 210 and 211 because it has
been observed that other test instances may issue pings to the node
running 210/211 and cause the ping_count to reset. These additional
drop rules ensure that any incoming messages are dropped.
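
The difference between a fixed sleep and polling can be sketched in Python. wait_update itself is a shell helper in the Lustre test framework; the function below is a hypothetical analogue that shows only the polling pattern, not that helper's interface.

```python
import time


def wait_until(check, expected, timeout=10.0, interval=0.1):
    """Poll `check()` until it returns `expected` or `timeout` expires.

    Returns True as soon as the value matches, so the caller waits only
    as long as needed instead of guessing a fixed sleep duration.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would pass a callable that reads the current recovery queue length or ping count and the value it expects to see. The timeout bounds the wait when the state never converges.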

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=210,211,216
Test-Parameters: testlist=sanity-lnet env=ONLY=211,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ief84388222e46c23952af4ad1d098924e73a8598
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51845
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17000 misc: remove Coverity annotations 93/51793/2
Timothy Day [Fri, 28 Jul 2023 04:49:50 +0000 (04:49 +0000)]
LU-17000 misc: remove Coverity annotations

These Coverity function annotations were added
around 10 years ago. Since then, Coverity seems
to produce fewer false positives. Out of about 20
annotations, only 3 warnings get suppressed.
Thus, the applicability of these annotations
should be re-evaluated.

Coverity has more advanced tools now for reducing
false positives. Various Lustre functions and
macros could be modeled rather than using
function annotations. But first, we need to get
a good idea of what kinds of false positives are
being generated.

https://scan.coverity.com/tune

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ibcb9cf55574675e20b13a4f7a1b9142a3b75e262
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51793
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16984 tests: replay-dual/31 checks file from DIR2 62/51762/2
Lei Feng [Wed, 26 Jul 2023 00:52:10 +0000 (08:52 +0800)]
LU-16984 tests: replay-dual/31 checks file from DIR2

In replay-dual/test_31, check file existence from DIR2.
Add more messages for diagnosis.

Fixes: 07764c4eeb ("LU-16953 tests: wait longer in replay-dual/test_31")
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=replay-dual env=ONLY=31,ONLY_REPEAT=100
Change-Id: Iee679ee94ac2cb51baad1651bfaddf452fafdbd1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51762
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16961 clang: plugins and build system integration 59/51659/4
Timothy Day [Thu, 13 Jul 2023 04:19:41 +0000 (04:19 +0000)]
LU-16961 clang: plugins and build system integration

Clang has a plugin system. Compiler extensions can be created
by making a shared library and loading it via the "-fplugin"
option. This makes it simple to implement custom warnings
and static analyzers.

This patch adds a plugin to detect functions that should have
been made static. This plugin has been run over the majority
of the Lustre tree and patches have been submitted for all
warnings. The plugin did not return any false positives in
my testing.

It also adds the "--enable-compiler-plugins" configure option,
which automatically builds and sets up the in-tree C compiler
plugins. The option force-enables the plugin regardless of
which compiler is in use. This behavior could be changed if
there is ever a need to support GCC specific plugins.

Also, add the configure checks needed to support building C++
in the Lustre tree. Clang and GCC plugins (and the compilers
themselves) are written in C++.

The license for the plugin mirrors that of the LLVM project
itself. This leaves the door open for contributing this
plugin upstream in the future. This isn't being upstreamed
now because it lacks any significant user community. Hence,
the plugin does not appear to meet the requirements for
upstreaming based on https://clang.llvm.org/get_involved.html.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I747ed91b53e765cc58e91a3eb9ec6c12b9908a96
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51659
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16605 lfs: Add -n option to fid2path 26/51626/12
Arshad Hussain [Tue, 11 Jul 2023 05:55:36 +0000 (11:25 +0530)]
LU-16605 lfs: Add -n option to fid2path

Add '-n' option to fid2path to allow printing
only the filename of the file instead of the
whole parent pathname.

Test-case sanity/226d added.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ieebd39a1655b4e3ad20cdbb4941dbb44882845f4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51626
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16943 tests: fix replay-single/135 under hard failure mode 74/51574/6
Jian Yu [Wed, 12 Jul 2023 13:48:41 +0000 (21:48 +0800)]
LU-16943 tests: fix replay-single/135 under hard failure mode

This patch fixes replay-single test_135() to load libcfs module
on the failover partner node to avoid 'fail_val' setting error.
It also fixes the issue that not all of the OSTs are mounted after
failing back ost1.

Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200 testlist=replay-single
Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200 fstype=zfs testlist=replay-single

Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200,FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=replay-single

Change-Id: Id46c722a6db9d832829a739f41f7462b32a6d9d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51574
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16936 auster: add --client-only option 09/51509/4
Timothy Day [Thu, 29 Jun 2023 15:37:21 +0000 (15:37 +0000)]
LU-16936 auster: add --client-only option

Add a flag to auster to run sanity tests only on the
client side. This leverages some existing functionality
to avoid having to set up ssh to filesystem hosts and
some other tedious setup.

Force test-framework.sh to honor the --no-setup flag.
Several test suites attempt to set up Lustre even if
auster says not to. Some lower level tests, like those
related to OBD device loading, require that Lustre
not be set up.

Change some [ to [[ in test-framework.sh to silence
some error messages.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I24de10743c3845b51fe29518ffc993b15a7c2cdd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51509
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16883 ldiskfs: update for ext4-delayed-iput for RHEL9.0 76/51376/2
Shaun Tancheff [Tue, 20 Jun 2023 07:31:53 +0000 (14:31 +0700)]
LU-16883 ldiskfs: update for ext4-delayed-iput for RHEL9.0

The ext4-delayed-iput patch does not apply cleanly to RHEL9.0.

Adjust the minor conflict in ext4_put_super().

Test-Parameters: trivial
Fixes: 616fa9b581 ("LU-15404 ldiskfs: use per-filesystem workqueues to avoid deadlocks")
HPE-bug-id: LUS-11661
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia8c2dcda50417b113399973f177a14283514a1ff
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16896 flr: resync should not change file size 44/51344/6
Bobi Jam [Sat, 17 Jun 2023 00:51:26 +0000 (08:51 +0800)]
LU-16896 flr: resync should not change file size

Mirror resync could punch a hole reaching the end of file in a
mirror, which could change the file size when that mirror is referenced.

This patch calls truncate after punch in this case to keep the file
size unchanged in the mirror.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ia0fc1f220a32a60f3516c69e86867796ae5c35c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16906 build: Server for newer SUSE 15 SP3 kernels 38/51338/5
Shaun Tancheff [Tue, 15 Aug 2023 09:06:48 +0000 (04:06 -0500)]
LU-16906 build: Server for newer SUSE 15 SP3 kernels

Update the SUSE 15 SP3 server support for newer kernels
including LTSS series kernels.

Add a new ldiskfs patch series for updated SUSE 15 SP3
kernels with an updated ext4-pdirop.patch.

Test-Parameters: trivial
HPE-bug-id: LUS-11676
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0acf81abfcc71a64dc09a344a9231d86a44f193e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51338
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16477 ldiskfs: Add ext4-enc-flag patch for SUSE 15 SP5 45/51945/3
Shaun Tancheff [Tue, 15 Aug 2023 12:40:50 +0000 (07:40 -0500)]
LU-16477 ldiskfs: Add ext4-enc-flag patch for SUSE 15 SP5

Include ext4-enc-flag for linux 5.14 in the 5.14 based SUSE 15 SP5
ldiskfs series.

Test-Parameters: trivial
HPE-bug-id: LUS-11442
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If73c1665d5623f90d6908b049eb27755952b03f0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51945
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 months ago LU-16821 llite: report 1MiB directory blocksize 60/50960/4
Andreas Dilger [Thu, 11 May 2023 17:49:52 +0000 (11:49 -0600)]
LU-16821 llite: report 1MiB directory blocksize

Report st_blksize=1048576 for directories so that glibc readdir()
will allocate a larger buffer to match the MDS_READDIR size
and reduce the number of syscalls for large dirs.
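
The effect of a larger directory buffer can be estimated with simple arithmetic. In this Python sketch the helper names are hypothetical, and the 32 KiB comparison buffer and the 32 MiB directory size are illustrative assumptions rather than figures from the patch.

```python
import os

MIB = 1 << 20


def readdir_calls(dir_bytes, buffer_bytes):
    """Number of buffer refills needed to read `dir_bytes` of entries."""
    return -(-dir_bytes // buffer_bytes)  # ceiling division


def directory_blksize(path):
    """st_blksize reported for `path`, the hint a reader may size its buffer from."""
    return os.stat(path).st_blksize
```

Under these assumptions, 32 MiB of directory entries need 1024 refills with a 32 KiB buffer but only 32 with a 1 MiB buffer. `directory_blksize()` shows how a program can inspect the value a filesystem reports for a given directory.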

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: If64057c20ecc35194c319d2a88c3036f12c41ed5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
7 months ago LU-16816 obdclass: make import_event more robust 15/50915/4
Sebastien Buisson [Wed, 10 May 2023 12:49:00 +0000 (14:49 +0200)]
LU-16816 obdclass: make import_event more robust

Make mdc_import_event and osc_import_event more robust, by not
assuming input variables can be dereferenced.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31a6477d58b7bb9a557ea561f7b0fa3fbcae5762
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-16232 script: fix the argument parse 76/50876/4
Yang Sheng [Sat, 6 May 2023 07:16:17 +0000 (15:16 +0800)]
LU-16232 script: fix the argument parse

The issue makes the script skip the other arguments if
the special parameter is not the last one.

Test-Parameters: trivial

Fixes: b533700add ("LU-16232 scripts: changelog/updatelog emergency cleanup")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ia309e7b6f1a62e76b80851848601c3d0b03be8b2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50876
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-9859 libcfs: discard cfs_gettok and cfs_str2num_check 44/50844/3
Mr NeilBrown [Thu, 24 Aug 2023 14:32:40 +0000 (10:32 -0400)]
LU-9859 libcfs: discard cfs_gettok and cfs_str2num_check

cfs_gettok() and cfs_str2num_check() are no longer used in the kernel,
so remove them.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I49a8378f049a936a742681293db616f7eb9b11af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50844
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-16552 test: add new lnet test for Multi-Rail setups 02/50302/19
James Simmons [Sun, 13 Aug 2023 15:02:33 +0000 (11:02 -0400)]
LU-16552 test: add new lnet test for Multi-Rail setups

You can crash the lnet kernel module by setting up an interface with
lctl net up and then attempting to set up the interface with
the import function. This is due to improper clearing of the net_cpts
array.

Currently sanity-lnet.sh doesn't really test MR setups. Because of
this a few bugs slipped in. Add two new tests to ensure MR setups
behave properly. Test 107 checks that deleting a second interface
in an MR setup doesn't crash a node. Test 108 creates a Multi-Rail
setup of a tcp LNet net with two interfaces, one real and the
other fake. A bug was preventing the second fake interface from
being added.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ic69e14bd0617f4d6fe931140b5b6d43b795843cf
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-16374 ldiskfs: implement security.encdata xattr 56/49456/13
Sebastien Buisson [Tue, 20 Dec 2022 14:40:52 +0000 (15:40 +0100)]
LU-16374 ldiskfs: implement security.encdata xattr

security.encdata is a virtual xattr containing information related
to encrypted files. It is expressed as ASCII text with a "key: value"
format, and space as field separator. For instance:

   { encoding: base64url, size: 3012, enc_ctx: YWJjZGVmZ2hpamtsbW
   5vcHFyc3R1dnd4eXphYmNkZWZnaGlqa2xtbg, enc_name: ZmlsZXdpdGh2ZX
   J5bG9uZ25hbWVmaWxld2l0aHZlcnlsb25nbmFtZWZpbGV3aXRodmVyeWxvbmdu
   YW1lZmlsZXdpdGg }

'encoding' is the encoding method used for binary data, assume name
can be up to 255 chars.
'size' is the clear text file data length in bytes.
'enc_ctx' is encoded encryption context, 40 bytes for v2.
'enc_name' is encoded encrypted name, 256 bytes max.
So overall, this xattr is at most 727 chars plus the terminating '\0'.
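
The text format above can be read back with a few lines of code. This is a minimal user-space sketch, not the ldiskfs implementation: parse_encdata() and b64url_decode() are hypothetical helper names, and the parser assumes the fields are separated by ", " exactly as in the example, with the wrapped lines joined back together.

```python
import base64


def b64url_decode(text: str) -> bytes:
    """Decode base64url text that was written without '=' padding."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def parse_encdata(value: str) -> dict:
    """Split '{ key: value, key: value }' into a dict and decode binary fields."""
    inner = value.strip().strip("{}").strip()
    fields = {}
    for item in inner.split(","):
        key, _, val = item.strip().partition(":")
        fields[key.strip()] = val.strip()
    fields["size"] = int(fields["size"])
    fields["enc_ctx"] = b64url_decode(fields["enc_ctx"])
    fields["enc_name"] = b64url_decode(fields["enc_name"])
    return fields


# The example value from the commit message, with the line wrapping removed.
EXAMPLE = (
    "{ encoding: base64url, size: 3012, enc_ctx: YWJjZGVmZ2hpamtsbW"
    "5vcHFyc3R1dnd4eXphYmNkZWZnaGlqa2xtbg, enc_name: ZmlsZXdpdGh2ZX"
    "J5bG9uZ25hbWVmaWxld2l0aHZlcnlsb25nbmFtZWZpbGV3aXRodmVyeWxvbmdu"
    "YW1lZmlsZXdpdGg }"
)

info = parse_encdata(EXAMPLE)
print(info["encoding"], info["size"], len(info["enc_ctx"]))
```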

On get, the value of the security.encdata xattr is computed from
encrypted file's information.
On set, encrypted file's information is restored from xattr value.
The encrypted name is stored temporarily in a dedicated xattr
LDISKFS_XATTR_NAME_RAWENCNAME, that will be used to set correct name
at linkat.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia318c39d403b1c448e71bcd5b29862d022d05d0a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49456
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-16235 hsm: check CDT state before adding actions llog 42/48842/6
Nikitas Angelinas [Tue, 12 Jul 2022 17:15:36 +0000 (20:15 +0300)]
LU-16235 hsm: check CDT state before adding actions llog

Don't allow HSM requests to be added to the actions llog when
cdt_state is in CDT_STOPPED/CDT_STOPPING as the CDT is unavailable, or
in CDT_INIT as any HSM requests in the llog may not have been fully
processed and so cdt_last_cookie may not have been set appropriately,
otherwise a colliding cookie value can be reused in
mdt_agent_record_add() and the assertions in
cdt_agent_record_hash_add() can be triggered:

"ASSERTION( carl0->carl_cat_idx == carl1->carl_cat_idx ) failed"
"ASSERTION( carl0->carl_rec_idx == carl1->carl_rec_idx ) failed"

Requests needed to implement the Remove Archive on Last Unlink (RAoLU)
policy are allowed when the CDT is shutdown, as those are safe
operations. They are also allowed during CDT initialization, even
though this can lead to the assertions being triggered, as doing so
maintains administrator expectations regarding file archives always
being removed when the RAoLU policy is enabled. This could possibly be
improved by e.g. failing when mdt_handle_last_unlink() is not able to
add an HSM remove request, or saving the requests in an llog so they
can be sent if the CDT is available later.

For the same reason, the llog needs to be processed before setting
cdt_state to CDT_RUNNING in the coordinator thread.
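
The admission rule described above can be summarized as a small predicate. This is a sketch of the policy as stated in this message, not the MDT code: may_add_action() is a hypothetical name, and only the coordinator states named above are modeled.

```python
from enum import Enum, auto


class CdtState(Enum):
    # Only the states mentioned in the commit message are modeled here.
    CDT_STOPPED = auto()
    CDT_INIT = auto()
    CDT_RUNNING = auto()
    CDT_STOPPING = auto()


# States in which ordinary HSM requests must not be added to the actions llog.
BLOCKED = {CdtState.CDT_STOPPED, CdtState.CDT_STOPPING, CdtState.CDT_INIT}


def may_add_action(state: CdtState, is_raolu_remove: bool = False) -> bool:
    """Return True if a request may be added to the actions llog."""
    if is_raolu_remove:
        # RAoLU remove requests are allowed regardless of coordinator state.
        return True
    return state not in BLOCKED


print(may_add_action(CdtState.CDT_INIT))
print(may_add_action(CdtState.CDT_INIT, is_raolu_remove=True))
```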

Change-Id: I4b5f5ee22f74827b31d8ed5917a8fc16e35d1f16
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
HPE-bug-id: LUS-8231, LUS-11064
Fixes: e26d7cc3 ("LU-14399 hsm: process hsm_actions in coordinator")
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48842
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
7 months agoLU-15526 mdt: enable remote PDO lock 33/46733/5
Lai Siyao [Fri, 7 Jul 2023 15:06:02 +0000 (11:06 -0400)]
LU-15526 mdt: enable remote PDO lock

Once parent directory is located on remote MDT, enqueue two locks like
local PDO lock if it's locked in LCK_PW mode. With this change,
creating directories (either local or remote) under one directory will
hardly trigger commit-on-sharing (unless their PDO hashes equal).

Updated sanityn 33c.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I28030c45fbf137f5912863ae5eacfc8372db6754
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46733
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-13730 tests: add file mirroring to racer 68/41368/3
Andreas Dilger [Fri, 29 Jan 2021 21:01:05 +0000 (14:01 -0700)]
LU-13730 tests: add file mirroring to racer

Add "lfs mirror extend" to racer to add mirrors to existing files.

Test-Parameters: trivial testlist=racer,racer,racer
Test-Parameters: fstype=zfs testlist=racer,racer,racer
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaa64ed2de54533838ce955f88a1be592923ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-14361 statahead: add statahead advise IOCTL 25/48625/16
Qian Yingjin [Thu, 22 Sep 2022 09:24:14 +0000 (05:24 -0400)]
LU-14361 statahead: add statahead advise IOCTL

This patch reuses ioctl(LL_IOC_LADVISE2) for statahead advise.
This allows userspace programs to advise the kernel statahead
of the order that they will be traversing a directory, so that
the client can prefetch inode attributes from the MDT, similar
to what posix_fadvise(POSIX_FADV_SEQUENTIAL) does for file data.

With mdtest patched to issue this statahead IOCTL hint, statahead
can support the mdtest benchmark, which uses a regular file naming
format:
mdtest.$rank.$i
The usage of this statahead advise IOCTL could be as follows:
open(dir);
ioctl(dir_fd, LL_IOC_LADVISE2, ...);
stat mdtest.0.0;
stat mdtest.0.1;
stat mdtest.0.2;
stat mdtest.0.3;
...
closedir(dir);
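
The access pattern in the outline above can be reproduced on any Linux filesystem. The sketch below only generates the mdtest.$rank.$i names and stats them in order relative to an open directory descriptor. The ioctl call itself is left out because the request layout is not given in this message, and mdtest_names() is a hypothetical helper.

```python
import os
import tempfile


def mdtest_names(rank: int, count: int) -> list:
    """File names in the regular mdtest.$rank.$i format, in traversal order."""
    return [f"mdtest.{rank}.{i}" for i in range(count)]


with tempfile.TemporaryDirectory() as d:
    names = mdtest_names(0, 4)
    for name in names:
        open(os.path.join(d, name), "w").close()

    dir_fd = os.open(d, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # On a Lustre client, ioctl(dir_fd, LL_IOC_LADVISE2, ...) would be
        # issued here, before the stat calls; it is omitted in this sketch.
        sizes = [os.stat(name, dir_fd=dir_fd).st_size for name in names]
    finally:
        os.close(dir_fd)

print(names, sizes)
```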

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iac38e33bfc6d7a0b755c2646ba8053a263e3afc9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48625
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-14156 utils: mirror split to check for last in-sync early 82/40782/47
Alex Zhuravlev [Fri, 27 Nov 2020 11:00:46 +0000 (14:00 +0300)]
LU-14156 utils: mirror split to check for last in-sync early

Currently the check that prevents splitting the last in-sync component
is done once the file is opened with O_RDWR, which interrupts an ongoing
resync/extend process. Instead, we can do this check early, once
the layout is fetched (after the first open with O_RDONLY).

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iee08d23008b44d2a7b2127358116a95ace40b7dd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40782
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 months agoLU-12645 llite: Move readahead debug before exit 32/51932/5
Patrick Farrell [Fri, 11 Aug 2023 22:01:26 +0000 (18:01 -0400)]
LU-12645 llite: Move readahead debug before exit

The core debug of ll_readahead() is before two return
conditions, which makes it really tricky to debug those
conditions.

Let's fix that.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic3a3854527cad62c891c6a25029353a4742e555f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51932
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-6142 lov: cleanup unneeded macros from lov_request.c 14/52014/2
Timothy Day [Sun, 20 Aug 2023 04:10:27 +0000 (04:10 +0000)]
LU-6142 lov: cleanup unneeded macros from lov_request.c

One macro defines a custom U64_MAX. The other adds
together two numbers, capping the sum at U64_MAX.
These macros are only used in a couple places. The
logic would be clearer and more concise without them.
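
For reference, the capped addition that the second macro performed can be written directly. This is an illustrative sketch, not the code from lov_request.c: add_capped() is a hypothetical name, and since Python integers do not wrap, the 64-bit limit is applied explicitly.

```python
U64_MAX = 2**64 - 1


def add_capped(a: int, b: int) -> int:
    """Add two unsigned 64-bit values, capping the sum at U64_MAX."""
    if not (0 <= a <= U64_MAX and 0 <= b <= U64_MAX):
        raise ValueError("operands must fit in an unsigned 64-bit integer")
    return min(a + b, U64_MAX)


print(add_capped(1, 2))
print(add_capped(U64_MAX, 1) == U64_MAX)
```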

Also, fix an incorrect comment.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I31012fbddba459df909c27cde8c59461f013c3be
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-6142 ptlrpc: Fix style issues for layout.c 26/51926/3
Arshad Hussain [Wed, 9 Aug 2023 04:30:01 +0000 (10:00 +0530)]
LU-6142 ptlrpc: Fix style issues for layout.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/layout.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ib482495ede6264dd3d42f90dbc50606487fd0b52
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-6142 ptlrpc: Fix style issues for events.c 88/51888/4
Arshad Hussain [Tue, 8 Aug 2023 10:15:13 +0000 (15:45 +0530)]
LU-6142 ptlrpc: Fix style issues for events.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/events.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8a49e3c9216a042ca157bde3b82a06918f3f6554
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51888
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-16847 ldiskfs: refactor code. 90/51390/6
Alexey Lyashkov [Tue, 20 Jun 2023 12:23:56 +0000 (15:23 +0300)]
LU-16847 ldiskfs: refactor code.

Unused parameters should be removed to reduce stack usage.
iobuf is a common struct in the IO path now.

Test-Parameters: trivial
HPE-bug-id: LUS-11645
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ie4d68ff7548f049de8706ac5b0e3f62eb15a211a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51390
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 months agoLU-16077 ptlrpc: Fix ptlrpc_body_v2 with pb_uid/pb_gid 22/51122/3
Etienne AUJAMES [Wed, 24 May 2023 13:26:27 +0000 (15:26 +0200)]
LU-16077 ptlrpc: Fix ptlrpc_body_v2 with pb_uid/pb_gid

ptlrpc_body_v2 and ptlrpc_body_v3 should have the same fields except
for jobid.

This patch fixes the debug request messages by printing request
uid/gid at the end. That way debugging tools can still parse message
for newer versions.

Fixes: 0544c10 ("LU-16077 tbf: pb_uid/pb_gid ptlrpc_body fields for TBF rules")
Test-Parameters: testlist=sanityn env=ONLY=77,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I1faa13fa7c5b03bfeeb7cd75f7dbbfa8ca8ca941
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
7 months agoLU-16827 obdfilter: Fix obdfilter-survery/1a 35/51035/12
Arshad Hussain [Thu, 24 Aug 2023 05:58:14 +0000 (01:58 -0400)]
LU-16827 obdfilter: Fix obdfilter-survery/1a

local_node() in the test framework is used
to determine whether a node is remote or local.
local_node() returns "true" if the node is
local. For a remote node it returns "false".

This patch fixes the obdfilter-survey/1a test case,
which was calling local_node() with reversed logic
to determine whether a node is remote or local.

This patch modifies local_node() to return
"true"/"false" instead of 0/1.

This patch also replaces lctl with $LCTL.

Test-Parameters: testlist=obdfilter-survey
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7bcb483975ec46d9847e0050e5a1f22f68663c80
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months agoLU-11457 osd-ldiskfs: scrub FID reuse 01/51601/7
Lai Siyao [Fri, 7 Jul 2023 09:21:05 +0000 (05:21 -0400)]
LU-11457 osd-ldiskfs: scrub FID reuse

It's possible that two inodes point back to the same FID. Check the
inodes in osd_scrub_check_update() to decide which mapping
should be kept:
* if one inode doesn't exist, its mapping is stale.
* if one inode's mtime is after the other's, keep this mapping.
* if the two inode mtimes are equal and one inode size is not 0, keep its
  mapping; otherwise, if both inode sizes are 0, just keep the existing
  mapping.
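
The rules above can be sketched as a small decision function. This is an illustration of the stated policy, not the osd_scrub_check_update() code: Inode and pick_mapping() are hypothetical names, a missing inode is modeled as None, and keeping the existing mapping when both sizes are non-zero (or both inodes are missing) is an assumption, since the message does not cover those cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inode:
    mtime: int
    size: int


def pick_mapping(existing: Optional[Inode], other: Optional[Inode]) -> str:
    """Return 'existing' or 'other' to say which FID mapping is kept."""
    # A mapping whose inode no longer exists is stale.
    if other is None:
        return "existing"
    if existing is None:
        return "other"
    # The inode modified later wins.
    if existing.mtime != other.mtime:
        return "existing" if existing.mtime > other.mtime else "other"
    # Equal mtimes: prefer the inode that holds data.
    if existing.size == 0 and other.size != 0:
        return "other"
    # Both empty (or, by assumption, both non-empty): keep the existing mapping.
    return "existing"


print(pick_mapping(Inode(mtime=100, size=0), Inode(mtime=100, size=4096)))
```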

Remove IDIF support in osd_scrub_check_update() to simplify
code logic.

Add sanity-scrub 4e to verify it.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ida020c2852c66f1a8910845bd16ab4c882858a4e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16097 tests: skip quota subtests in interop 09/52009/11
Andreas Dilger [Fri, 18 Aug 2023 21:55:10 +0000 (21:55 +0000)]
LU-16097 tests: skip quota subtests in interop

Skip subtests in sanity-quota.sh to avoid interop test failures,
backdated to check all new tests since 2.14.0 for completeness.

Test-Parameters: trivial testlist=sanity-quota ossversion=2.15.3
Test-Parameters: testlist=sanity-quota mdsversion=2.15.3
Fixes: 513b1cdbca ("LU-16340 quota: notify only global lqe")
Fixes: d4978678b4 ("LU-15694 quota: keep grace time while setting default")
Fixes: 25a70a88c9 ("LU-13952 quota: default OST Pool Quotas")
Fixes: 188112fc80 ("LU-14300 quota: avoid nested lqe lookup")
Fixes: 8c19365416 ("LU-13971 quota: report Pool Quotas for a user")
Fixes: a4fbe7341b ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Fixes: 3ffa5d680f ("LU-14740 llite: avoid project quota overflow")
Fixes: 29e00cecc6 ("LU-14696 llite: check read only mount for setquota")
Fixes: 789038c97a ("LU-15167 quota: fallocate send UID/GID for quota")
Fixes: 5fc934ebbb ("LU-15519 quota: fallocate does not increase projid usage")
Fixes: c9901b68b4 ("LU-13587 quota: protect qpi in proc")
Fixes: 61ec1e0f2c ("LU-15031 quota: reseed glbe in qmt_lvbo_udate")
Fixes: dfe7d2dd2b ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Fixes: 862f0baa7c ("LU-15097 quota: stop pool_recalc before killing pool")
Fixes: 61481796ac ("LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12")
Fixes: a2fd4d3aee ("LU-15880 quota: fix insane grant quota")
Fixes: 6c0b4329d0 ("LU-16339 quota: notify OSTs until lge_qunit_nu is set")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ife8bfd83d0f217c534f3b12b4c9d108d370ed6b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-13306 mgs: support large NID for mgs_write_log_osc_to_lov 53/52053/5
James Simmons [Wed, 23 Aug 2023 14:46:34 +0000 (10:46 -0400)]
LU-13306 mgs: support large NID for mgs_write_log_osc_to_lov

The various llogs on the MGS needed to be updated to support both
64 bit NID size and the newer large NID format. The function
mgs_write_log_osc_to_lov was missed in this update.

Test-Parameters: trivial testlist=runtests ossversion=2.15.3
Fixes: c0cb747ebe9 ("LU-13306 mgs: use large NIDS in the nid table on the MGS")
Change-Id: If543a0421d1f3cac9827581ce46da911c3456efd
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52053
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16541 tests: Improve test 64f 40/52040/5
Patrick Farrell [Tue, 22 Aug 2023 16:32:52 +0000 (12:32 -0400)]
LU-16541 tests: Improve test 64f

The buffered IO part of test 64f has several timing related
holes and other oddities.  The use of multiop in the
background does not guarantee the RPC will not be sent, AND
the test doesn't kill it correctly.

Clean this up and make a more reliable version of the test.
Hopefully this will resolve the failure issues, if not, a
better version of the test will allow debugging.

Test-Parameters: trivial
Test-Parameters: testlist=sanity envdefinitions=ONLY=64f,ONLY_REPEAT=20
Test-Parameters: testlist=sanity envdefinitions=ONLY=64f,ONLY_REPEAT=20
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I25b825e1d9d516635ef8cbd26dd12809625c34df
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52040
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
8 months agoLU-17005 obdclass: allow stats header to be disabled 23/51823/2
Andreas Dilger [Mon, 31 Jul 2023 19:34:22 +0000 (13:34 -0600)]
LU-17005 obdclass: allow stats header to be disabled

Add a global "enable_stats_header" tunable parameter that can be
set to enable/disable the "start_time" and "elapsed_time" fields
in the standard lprocfs "stats" files.

Default to enabled, since this landed shortly after v2_14_0.

Test-Parameters: trivial
Fixes: 5efb892396e3 ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I460b957447bfb83e6d4fd7395b79ce994f3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51823
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16341 tests: skip sanity-quota/test_14 for old MDS 49/51949/2
Alex Deiter [Tue, 15 Aug 2023 18:47:51 +0000 (22:47 +0400)]
LU-16341 tests: skip sanity-quota/test_14 for old MDS

Skip sanity-quota test_14 for old MDS missing the fix
for LU-16341 kernel NULL in qmt_site_recalc_cb.

Fixes: d965d63415 ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Test-Parameters: trivial testlist=sanity-quota env=ONLY=14
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I1a23daa06f0cd306c2b034df18617c2650945b28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17027 target: include linux/file.h 43/51943/2
Xinliang Liu [Tue, 15 Aug 2023 07:58:14 +0000 (07:58 +0000)]
LU-17027 target: include linux/file.h

In some 4.x kernels like 4.19 we need to include linux/file.h to
have alloc_file_pseudo() defined.

Change-Id: Ieee8d5ac5b080bd3b5c761f54a5ca2f9581ecfe1
Test-Parameters: trivial
Fixes: ac0380dc519a ("LU-137 osd-ldiskfs: pass through resize ioctl")
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51943
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16424 tests: Add version check in sanity-lnet 42/51942/4
Wei Liu [Mon, 14 Aug 2023 19:02:24 +0000 (12:02 -0700)]
LU-16424 tests: Add version check in sanity-lnet

Skip sanity-lnet test_205, test_207 and test_209 if
version is older than 2.14.58 since the lnet_if_list
function was added in Fixes:
3166a201e0 ("LU-15398 tests: Use remote peers for health tests")
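
A version gate like this one compares the version components numerically. The sketch below is not the test-framework code: version_tuple() and should_skip() are hypothetical helpers that show why a numeric comparison is needed, since a plain string comparison would rank "2.14.9" above "2.14.58".

```python
def version_tuple(version: str) -> tuple:
    """Turn '2.14.58' into (2, 14, 58) so components compare numerically."""
    return tuple(int(part) for part in version.split("."))


def should_skip(server_version: str, minimum: str = "2.14.58") -> bool:
    """Skip the subtest when the server is older than the minimum version."""
    return version_tuple(server_version) < version_tuple(minimum)


print(should_skip("2.14.0"))   # server older than the minimum
print(should_skip("2.15.3"))   # server newer than the minimum
```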

Test-Parameters: trivial testlist=sanity-lnet \
serverjob=lustre-b2_14 serverbuildno=2 \
serverdistro=el8.3

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: I9cd62d91980784e3b33cf4e30426bf74d17f717f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51942
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
8 months agoLU-16796 libcfs: Change struct cfs_hash to use kref 38/51938/3
Arshad Hussain [Fri, 11 Aug 2023 07:32:49 +0000 (13:02 +0530)]
LU-16796 libcfs: Change struct cfs_hash to use kref

This patch changes struct cfs_hash to use
kref(refcount_t) instead of atomic_t
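
For readers unfamiliar with the kref pattern, the toy model below shows its contract: the count starts at one, each get adds a reference, and the put that drops the last reference runs a release callback. It is a single-threaded illustration in Python, not kernel code, and the class and method names are hypothetical.

```python
class Kref:
    """Toy model of a reference count with a release callback."""

    def __init__(self, release):
        self._count = 1          # the creator holds the first reference
        self._release = release

    def get(self):
        if self._count <= 0:
            raise RuntimeError("get on an already released object")
        self._count += 1

    def put(self) -> bool:
        """Drop one reference; return True if the object was released."""
        if self._count <= 0:
            raise RuntimeError("put without a matching reference")
        self._count -= 1
        if self._count == 0:
            self._release()
            return True
        return False


released = []
ref = Kref(lambda: released.append("hash freed"))
ref.get()                 # second user takes a reference
first = ref.put()         # one reference remains
second = ref.put()        # last reference dropped, release runs
print(first, second, released)
```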

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I58b5e8311a34b3b128c1440b93958389b0fcdd48
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16831 tests: add version check to sanity-pfl/0e 30/51930/2
Jian Yu [Fri, 11 Aug 2023 19:58:27 +0000 (12:58 -0700)]
LU-16831 tests: add version check to sanity-pfl/0e

This patch adds MDS version check to sanity-pfl test 0e
to avoid interop test failure.

Test-Parameters: trivial \
serverjob=lustre-b2_15 serverbuildno=67 \
env=ONLY=0e testlist=sanity-pfl

Test-Parameters: trivial env=ONLY=0e testlist=sanity-pfl

Change-Id: I79df1f36f07f6b376525364708eacc687f85a061
Fixes: a250ecb959a9 ("LU-16831 lfs: limit stripe count for component size")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16796 target: Change struct top_multiple_thandle to use kref 22/51922/2
Arshad Hussain [Thu, 10 Aug 2023 12:30:46 +0000 (18:00 +0530)]
LU-16796 target: Change struct top_multiple_thandle to use kref

This patch changes struct top_multiple_thandle to use
kref(refcount_t) instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I5892e5ab14ea6570645e6395af6d8a0d2c325398
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51922
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17018 build: add 'linux-image-generic' as Depends 79/51879/5
Raphael Druon [Mon, 7 Aug 2023 07:26:09 +0000 (01:26 -0600)]
LU-17018 build: add 'linux-image-generic' as Depends

Add 'linux-image-generic >= 3.10' as a dependency for Debian dkms
package for Ubuntu support

Test-Parameters: trivial
Fixes: 621e0bc2f9 ("LU-16661 build: improve lustre.spec.in Requires")
Signed-off-by: Raphael Druon <rdruon@ddn.com>
Change-Id: Ie8bacbd55c379632d5554de8d72606c818c1771e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51879
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16796 ptlrpc: Change struct lsi_mounts to use kref 64/51864/4
Arshad Hussain [Mon, 31 Jul 2023 10:21:48 +0000 (15:51 +0530)]
LU-16796 ptlrpc: Change struct lsi_mounts to use kref

This patch changes struct lsi_mounts to use
kref(refcount_t) instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ia185b19123f535f8c54a6ea6b7a0212fbe85ffea
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17003 dne: remove REP-ACK support in DNE system 51/51851/3
Lai Siyao [Mon, 10 Jul 2023 22:49:50 +0000 (18:49 -0400)]
LU-17003 dne: remove REP-ACK support in DNE system

DNE system doesn't need to support REP-ACK. In the old implementation,
write locks are kept in PW|EX mode after transaction stop, and will
be downgraded to TXN mode till REP-ACK, and then not released until
transaction commit.

While in the period between transaction stop and REP-ACK, any read
lock request will be on hold till downgrade, with this change, this
read lock request will succeed immediately.  During this period, any
write lock request may involve extra commit, since mdt_blocking_ast()
does not know whether transaction has stopped, so it needs to trigger
commit-on-sharing immediately, and also set 'sync' flag in the lock.
If transaction is not stopped yet, later when it's stopped, it will
trigger another commit-on-sharing since the 'sync' flag is set.

With this change, mdt_blocking_ast() only needs to set 'sync' flag if
its mode is PW|EX, and trigger commit-on-sharing once upon unlock.
This reduces the number of transaction commits and may improve
performance in some corner cases.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I159a0ad619afd10e97be3dc175a6b4ed77b31142
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51851
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16796 ptlrpc: Change struct ls_device to use kref 11/51811/6
Arshad Hussain [Mon, 31 Jul 2023 04:27:00 +0000 (09:57 +0530)]
LU-16796 ptlrpc: Change struct ls_device to use kref

This patch changes struct ls_device to use
kref(refcount_t) instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iba3965ef884ef65ab2d379ed389dfbea4ef8a453
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51811
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 months agoLU-16999 lnet: Restore lpni aliveness check 91/51791/2
Chris Horn [Tue, 25 Apr 2023 18:53:46 +0000 (13:53 -0500)]
LU-16999 lnet: Restore lpni aliveness check

This is a revert of the following master change:

Lustre-change: https://review.whamcloud.com/46623/
Lustre-commit: caf6095ade66f70d4bad99ced7a918814a3af092

That patch restored the historic behavior of the LNet router peer
health feature, but it did not account for the fact that the old lnet
router checker behaved differently than the current implementation
that leverages LNet discovery to perform the router checker pings.
Because of this change to use discovery we can no longer guarantee
that each router end point will be ping'd within the peer aliveness
window, and as a result the router may incorrectly determine that some
peer NIs are not alive.

Revert this change until a long term solution can be found.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11604
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I77f4bd64b616693ab2c91c747bf327c6f71689c4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51791
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16998 lnet: Error when missing discover arg 90/51790/2
Chris Horn [Tue, 9 May 2023 20:05:21 +0000 (14:05 -0600)]
LU-16998 lnet: Error when missing discover arg

Print an error when a user does not supply a NID argument to the
'lnetctl discover' command.

Test-Parameters: trivial
HPE-bug-id: LUS-11487
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I081137db5490547a69248b7d2e7f7986b6d8612e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51790
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-15235 tests: skip sanity/56od in interop 80/51580/2
Andreas Dilger [Wed, 5 Jul 2023 20:07:52 +0000 (14:07 -0600)]
LU-15235 tests: skip sanity/56od in interop

Sanity test_56oc and test_56od were using the btime_supported()
function to check if "lfs find" supported file birth time, but
this did not properly check whether the MDS supported this option.

Remove the btime_supported() check and just use the version, since
this has been around a few releases already.

Fixes: 186b97e68abb ("LU-11971 utils: Send file creation time to clients")
Test-Parameters: trivial testlist=sanity serverversion=2.12.9 env=ONLY=56
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c85103c843d3b993e3e112bf5d0da976d3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16945 tests: skip sanity test_27Cg in interop 79/51579/2
Andreas Dilger [Wed, 5 Jul 2023 19:47:28 +0000 (13:47 -0600)]
LU-16945 tests: skip sanity test_27Cg in interop

Sanity test_27Cg is testing functionality that was broken in older
MDS versions, but does not have a version check, so it causes testing
to timeout 100% of the time when running on older servers.  Skip it.

Fixes: d96b98ee6b63 ("LU-16693 lod: ENODEV on setstripe with wrong OST#")
Test-Parameters: trivial testlist=sanity env=ONLY=27Cg
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I52b3d4e6a78a0db8f48401b128e22372f3d8a9bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51579
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16760 utils: support 'lfs find --attrs' and '-printf %La' 62/51562/8
Sebastien Buisson [Tue, 4 Jul 2023 07:28:37 +0000 (09:28 +0200)]
LU-16760 utils: support 'lfs find --attrs' and '-printf %La'

Add support to "lfs find" to filter on file attribute flags, with the
syntax "[!] --attrs=[^]ATTR[,...]".
Add support to "lfs find" to print file attribute flags with
"-printf %La".

Add sanity-sec test_65 for Encrypted and Immutable flags.
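
The "[!] --attrs=[^]ATTR[,...]" syntax can be modelled roughly as follows. This is only an illustrative sketch in Python, not the lfs implementation: it assumes that every listed attribute must be set, that every "^"-prefixed attribute must be clear, and that a leading "!" negates the whole test. The function names are hypothetical, and the attribute names are taken from the test description above.

```python
def parse_attrs(spec):
    """Split an "[^]ATTR[,...]" spec into (required, excluded) name sets."""
    required, excluded = set(), set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("^"):
            excluded.add(item[1:])   # "^ATTR": attribute must NOT be set
        else:
            required.add(item)       # "ATTR": attribute must be set
    return required, excluded


def matches(file_attrs, spec, negate=False):
    """Return True if a file's attribute set satisfies the spec.

    Assumed semantics (not confirmed by the commit message): all required
    attributes present, no excluded attribute present; negate models "!".
    """
    required, excluded = parse_attrs(spec)
    file_attrs = set(file_attrs)
    hit = required <= file_attrs and not (excluded & file_attrs)
    return hit != negate
```

For example, under these assumptions matches({"Immutable"}, "Immutable,^Encrypted") is true, while a file carrying both flags does not match.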

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5e5cfe5c8c8cbed8bb79f3ad6d8116347ecfe6ac
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51562
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 months agoLU-16360 osc: fix lu_ref usage 22/51522/2
Alexey Lyashkov [Fri, 2 Dec 2022 08:40:05 +0000 (11:40 +0300)]
LU-16360 osc: fix lu_ref usage

LDLM_LOCK_PUT should be used with a lock found by handle,
but LDLM_LOCK_RELEASE with a taken reference. Fix the usage.

HPE-bug-id: LUS-11365
Test-Parameters: trivial
Fixes: 9c2fb0b29cec (LU-9679 osc: convert oe_refc to kref)
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ib720b496b585c915ba20e0651a88c4afdde98e99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51522
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-14697 tests: change performance-sanity to use mdtest 14/51414/22
Alex Deiter [Thu, 22 Jun 2023 13:28:48 +0000 (17:28 +0400)]
LU-14697 tests: change performance-sanity to use mdtest

Replace mdsrate with mdtest in performance-sanity.sh

Test-Parameters: trivial
Test-Parameters: testlist=performance-sanity clientdistro=el7.9
Test-Parameters: testlist=performance-sanity clientdistro=el8.8
Test-Parameters: testlist=performance-sanity clientdistro=el9.2
Test-Parameters: testlist=performance-sanity clientdistro=ubuntu2204
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I1a80bab4ccbe085d3ff8d8b332c8e117e14ea9cb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51414
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
8 months agoLU-13805 obd: Reserve unaligned DIO connect flag 75/51075/7
Patrick Farrell [Wed, 9 Aug 2023 16:16:25 +0000 (12:16 -0400)]
LU-13805 obd: Reserve unaligned DIO connect flag

Unaligned DIO generally requires only client changes, but
an assert must be removed from ZFS servers for it to work
correctly.  This means we need a connect flag to recognize
whether or not a server running ZFS can safely use
unaligned DIO.

All OSTs will present this flag - to keep things simple -
but if the flag is not present, we'll still do unaligned
DIO to ldiskfs OSTs.

Actual implementation will be in another patch, this one
just creates the flag itself.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8b149cc54f4fb11e64182c65f2fbb01f8a3d3868
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51075
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-9859 lnet: simplify cfs_ip_addr_parse() 42/50842/9
Mr NeilBrown [Tue, 24 Nov 2020 23:10:14 +0000 (10:10 +1100)]
LU-9859 lnet: simplify cfs_ip_addr_parse()

cfs_ip_addr_parse() is now always passed a string that it is safe to
modify.  So change the parsing to benefit from this and use standard
tools like strsep().

Note that the 'len' argument is now ignored.  It cannot be removed
without a larger change.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ice492edf109dca2e411132b891514f0caa535d8c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50842
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
8 months agoLU-10391 obdclass: handle large NIDs for mount strings 62/50362/17
James Simmons [Thu, 3 Aug 2023 20:57:02 +0000 (16:57 -0400)]
LU-10391 obdclass: handle large NIDs for mount strings

Mount strings support using ':' as a delimiter but this is also
part of some NID strings, such as IPv6, so rework class_parse_value()
to only look at ':' when it occurs after '@'.
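
The delimiter rule described above can be sketched as follows. This is a minimal Python illustration, not the class_parse_value() code: the function name split_nids() is hypothetical, and it assumes that a ':' counts as a delimiter only once an '@' has been seen in the current NID.

```python
def split_nids(value):
    """Split a ':'-delimited string, honouring ':' only after an '@'.

    Colons that appear before the '@' (as in an IPv6 address) are kept
    as part of the NID itself.
    """
    parts = []
    start = 0
    seen_at = False
    for i, ch in enumerate(value):
        if ch == "@":
            seen_at = True
        elif ch == ":" and seen_at:
            parts.append(value[start:i])
            start = i + 1
            seen_at = False      # the next NID needs its own '@'
    parts.append(value[start:])
    return parts


print(split_nids("2001:db8::1@tcp:10.0.0.1@tcp"))
```

With this rule the colons inside the IPv6 address stay part of the first NID, and only the ':' following "@tcp" separates the two entries.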

The mount utilities use the function convert_hostnames() to ensure
the mount string containing an NID is valid. This only works for
small size nids so migrate the function to handle large NIDs. This
should allow mounting with IPv6 or other large NID addresses.

In testing, the userland libcfs_ip_str2addr_size() had bugs that
rendered incorrect NID strings. Fix those issues.

Fixes: b6c702df5d4 ("LU-10391 libcfs: add large-nid string conversion functions.")
Change-Id: Ic9b2a368456ba75ceb5911ac7f75ae00d6123870
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16962 build: parallel header checks 73/51673/4
Shaun Tancheff [Fri, 14 Jul 2023 09:21:45 +0000 (16:21 +0700)]
LU-16962 build: parallel header checks

Add LB2_CHECK_LINUX_HEADER_SRC and LB2_CHECK_LINUX_HEADER_RESULT
macros to use for running header checks in parallel.

Migrate (most) header checks to parallel and run a subset
early as the results of those tests are required by other
configure tests.

Test-Parameters: trivial
HPE-bug-id: LUS-11710
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia765261179d25e96912e65e31c81824b4507e604
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51673
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
8 months agoLU-16957 build: Improve parallel --config-cache 37/51637/5
Shaun Tancheff [Fri, 14 Jul 2023 08:34:02 +0000 (15:34 +0700)]
LU-16957 build: Improve parallel --config-cache

The parallel build should consider the configure cache before
adding tests to the parallel build pass.

Track the number of compile tests needed, skip the make when
no build tests are needed.
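
The idea of consulting the cache first and skipping the make when nothing is left to build can be sketched like this. The Python below is only a model of the control flow, not the autoconf macros; plan_compile_tests(), run_configure(), and the run_make callback are hypothetical names.

```python
def plan_compile_tests(tests, cache):
    """Return the tests whose results are not already in the cache."""
    return [name for name in tests if name not in cache]


def run_configure(tests, cache, run_make):
    """Run only the uncached tests; return how many had to be compiled.

    run_make is a callable taking a list of test names and returning a
    dict of results. It is not called at all when nothing is pending.
    """
    pending = plan_compile_tests(tests, cache)
    if pending:
        cache.update(run_make(pending))
    return len(pending)
```

On a second run with a warm cache the pending list is empty, so the parallel make is never started, which is consistent with the large drop in the "patch w/cache" timings below.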

Also unify libcfs, core, and ldiskfs build passes to a single step.

Configure timings vs master

     master       master w/cache  |     patch         patch w/cache
 --------------   --------------- | ---------------  ----------------
 real  1m3.493s   real  0m34.024s | real  1m3.903s    real  0m8.404s
 user 1m34.587s   user  1m16.547s | user  1m37.191s   user  0m4.292s
 sys  0m35.119s   sys   0m22.687s | sys   0m35.297s   sys   0m5.514s

Test-Parameters: trivial
HPE-bug-id: LUS-11706
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6696b350e8315190a67c1463435b18a87d45813e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
8 months agoLU-13805 llite: Add copy of iovec to sub-dio 40/49940/28
Patrick Farrell [Wed, 8 Feb 2023 04:09:17 +0000 (23:09 -0500)]
LU-13805 llite: Add copy of iovec to sub-dio

It's very useful to move some of the direct I/O processing
from the main thread to the ptlrpc threads.  This is done
by associating the processing with each sub DIO, or DIO
'chunk'.  This requires a local copy of the iovec in each
chunk, because we:
A) need the chunk-current state of the iovec (as we move
along the iovec as chunks are created)
B) some operations will modify the iovec, and so to do
them from multiple ptlrpc threads, each needs to work on a
separate copy of the iovec.
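
A toy model of why each chunk needs its own copy is sketched below in Python. It is not the llite code: IovIter is a hypothetical stand-in for an iovec iterator, reduced to an offset and a remaining byte count, and split_into_chunks() is an illustrative name.

```python
import copy
from dataclasses import dataclass


@dataclass
class IovIter:
    """Toy stand-in for an iovec iterator: a position within one buffer."""
    offset: int
    remaining: int

    def advance(self, nbytes):
        """Move forward by up to nbytes; return the bytes actually consumed."""
        nbytes = min(nbytes, self.remaining)
        self.offset += nbytes
        self.remaining -= nbytes
        return nbytes


def split_into_chunks(it, chunk_size):
    """Give each chunk a private snapshot of the iterator at its start."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    while it.remaining > 0:
        snapshot = copy.copy(it)          # chunk-current state, owned by the chunk
        length = it.advance(chunk_size)   # the main iterator moves on
        chunks.append((snapshot, length))
    return chunks
```

Because every chunk holds a separate snapshot, a worker that advances its own copy does not disturb the position seen by any other chunk.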

This will be used by copy_page_to_iter in completing
unaligned DIO reads.

This has been split out from the main unaligned DIO patch
for simplification.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5645e904b6f9423eafc69cc0f59349cb3dcb9920
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49940
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
8 months agoLU-13805 clio: Add write to sdio 91/49991/30
Patrick Farrell [Tue, 14 Feb 2023 18:33:35 +0000 (13:33 -0500)]
LU-13805 clio: Add write to sdio

Unaligned DIO will need to know if an sdio is a write or
a read, so we add this info to the sub-dio.

Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8ee042ca5a0461db672ba98b7fa6c64b01a8bba2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49991
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-13805 tests: Add racing tests of BIO, DIO 29/50529/27
Patrick Farrell [Tue, 4 Apr 2023 18:53:08 +0000 (14:53 -0400)]
LU-13805 tests: Add racing tests of BIO, DIO

We're a bit short on racing tests for buffered IO and
direct IO.  This patch adds a number of tests.  These
were originally part of the unaligned DIO patch, but
some of them have shown issues without unaligned IO.

So this patch puts in these tests with only aligned IO so
we can see which failures are specific to the unaligned IO
changes and which are not.

This patch should be landable like this.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I861bcaec785936cb9c3752f8148dcab4054f6078
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50529
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-7073 tests: Add file migration to racer 69/13669/25
Henri Doreau [Fri, 6 Feb 2015 09:01:36 +0000 (10:01 +0100)]
LU-7073 tests: Add file migration to racer

Make racer run both blocking and non-blocking "lfs migrate" commands.
Implement this within the file_create.sh script, since it is already
selecting among different layout types.

Update Makefile.am to avoid listing every racer filename explicitly
to make it easier to add new types of operations in the future.

Test-Parameters: trivial testlist=racer,racer,racer
Test-Parameters: fstype=zfs testlist=racer,racer,racer
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I51b3f19c78029ff47102e96a71ec4a0fc472183a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/13669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-930 misc: update MAINTAINERS addresses 25/52025/3
Andreas Dilger [Sun, 26 Mar 2023 23:40:19 +0000 (17:40 -0600)]
LU-930 misc: update MAINTAINERS addresses

Update email addresses for various maintainers, and remove those
people who are no longer working on Lustre.

Change "M:" to "R:", since "M:" means "Mail to" and not "Maintainer"
as I thought.  Use "R:" for "Reviewer", and remove other tags that
we don't want in this file for Lustre.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3e73963b08181154fa48f308cb3d1d0a533ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52025
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16997 kfilnd: Correct RX buffer size 89/51789/2
Chris Horn [Thu, 27 Jul 2023 15:47:22 +0000 (09:47 -0600)]
LU-16997 kfilnd: Correct RX buffer size

The immediate receive buffers are large buffers where incoming
messages are byte packed into the buffer until all buffer space is
exhausted. The size of the buffer is kkfilnd module parameter
credits * 4096. The number of buffers is controlled by kkfilnd module
parameter immediate_rx_buf_count.

With the current defaults this results in only 1MiB of buffers per RX
context (i.e. CPT) to sync these messages. While kfilnd tries to
replenish these buffers as fast as possible, it may not be fast enough
and replenishing can be delayed based on CPU availability. Change
default credits to 512 so that we have 8x 2MiB buffers.
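
The sizing arithmetic above can be checked with a few lines of Python. This is only a worked calculation: the 4096-byte factor and the credits value of 512 come from the message, the buffer count of 8 is read off "8x 2MiB buffers", and the function names are hypothetical.

```python
BYTES_PER_CREDIT = 4096        # factor stated in the commit message
MIB = 1024 * 1024


def rx_buffer_bytes(credits):
    """Size of one immediate receive buffer: credits * 4096."""
    return credits * BYTES_PER_CREDIT


def total_rx_bytes(credits, buf_count):
    """Total immediate receive buffer space for one RX context."""
    return rx_buffer_bytes(credits) * buf_count


print(rx_buffer_bytes(512) // MIB)     # 2  (MiB per buffer)
print(total_rx_bytes(512, 8) // MIB)   # 16 (MiB per RX context)
```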

Test-Parameters: trivial
HPE-bug-id: LUS-10548
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I83f813ba2e295e6087131dcdfb12fc0feebb4834
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51789
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16996 kfilnd: Wrong traffic class assigned 88/51788/2
Ron Gredvig [Thu, 15 Dec 2022 20:04:20 +0000 (14:04 -0600)]
LU-16996 kfilnd: Wrong traffic class assigned

The wrong traffic class was being assigned to a transmit
context when multiple networks were assigned to the same
interface.

This was discovered by noticing a currently unsupported
traffic class didn't fail when it was used. The traffic
class from the shared domain was being used instead.

This was fixed by explicitly specifying the traffic
class when creating a transmit context for an endpoint.

Test-Parameters: trivial
HPE-bug-id: LUS-11415
Signed-off-by: Ron Gredvig <ron.gredvig@hpe.com>
Change-Id: I21236c01d4bef53b62e2f303c8e24e059ce83c0a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51788
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16995 kfilnd: Handle TAG_RX_OK in TN_STATE_FAIL 87/51787/2
Chris Horn [Thu, 13 Apr 2023 15:36:37 +0000 (09:36 -0600)]
LU-16995 kfilnd: Handle TAG_RX_OK in TN_STATE_FAIL

It is possible for the fabric to delay packets such that the retry
handler cancels the message but it is still delivered to the target.
If the timing is right then the initiator may receive a TAG_RX_OK
event after the transaction has transitioned to TN_STATE_FAIL. This
currently trips an LBUG, but instead we can allow the transaction to
complete normally.

Test-Parameters: trivial
HPE-bug-id: LUS-11572
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I381d64713a7942fed09d41b30f64be602193057f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51787
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16994 kfilnd: Add work queue parameters 86/51786/2
Ron Gredvig [Thu, 20 Apr 2023 19:12:08 +0000 (19:12 +0000)]
LU-16994 kfilnd: Add work queue parameters

Added kfilnd work queue parameters to allow tuning.

The wq_high_priority parameter enables the work queue to run
as high priority. Default is enabled.

The wq_cpu_intensive parameter enables the work queue to run
as cpu intensive. Default is disabled.

The wq_max_active parameter sets the max in-flight work items
of the work queue. Default is 512.

The prov_cpu_exclusive parameter enables reserving one of a
CPT's CPUs for the exclusive use of the kfabric provider.
Default is disabled.

Test-Parameters: trivial
HPE-bug-id: LUS-11605
Signed-off-by: Ron Gredvig <ron.gredvig@hpe.com>
Change-Id: Ic4db95787a864efca3ea1234953ce3ea828f3594
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51786
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16993 kfilnd: Add count value to debugfs stats 85/51785/2
Ron Gredvig [Fri, 12 May 2023 19:44:48 +0000 (19:44 +0000)]
LU-16993 kfilnd: Add count value to debugfs stats

The debugfs stats for initiator and target include
min, max and average. It is hard to interpret the
average without knowing how many samples were included
to calculate it.

Added the count value for extra context.
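
A minimal sketch of statistics that expose the sample count next to min, max and average is shown below. It is written in Python purely for illustration; DurationStats is a hypothetical class, not the kfilnd debugfs code.

```python
class DurationStats:
    """Running min/max/average that also reports how many samples it saw."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.min = None
        self.max = None

    def record(self, value):
        """Fold one sample into the running statistics."""
        self.count += 1
        self.total += value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    @property
    def average(self):
        """Mean of all samples, or 0.0 when nothing has been recorded."""
        return self.total / self.count if self.count else 0.0
```

An average of 20 over three samples and an average of 20 over three million samples look identical without the count, which is the ambiguity the patch addresses.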

Test-Parameters: trivial
HPE-bug-id: LUS-11627
Signed-off-by: Ron Gredvig <ron.gredvig@hpe.com>
Change-Id: I3d840c250653b4f29b40c169254b9c9b4c88f584
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51785
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16992 kfilnd: Expand debugfs to record min/max 84/51784/2
Ian Ziemba [Wed, 22 Feb 2023 23:56:21 +0000 (17:56 -0600)]
LU-16992 kfilnd: Expand debugfs to record min/max

This will be useful in determining how long kfilnd transactions take
when underlying NIC gets congested.

Test-Parameters: trivial
HPE-bug-id: LUS-11497
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Change-Id: I5506329086e6284e04ec7b609485d582a35ca0b5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51784
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16991 kfilnd: Init LNet NI data in dev alloc 83/51783/2
Ian Ziemba [Wed, 22 Feb 2023 22:00:43 +0000 (16:00 -0600)]
LU-16991 kfilnd: Init LNet NI data in dev alloc

LNet ni_nid was being set outside of kfilnd_dev_alloc(). This was
causing the incorrect debugfs directories to be generated inside
kfilnd_dev_alloc().

Fix this by setting all LNet NI fields inside kfilnd_dev_alloc().

Test-Parameters: trivial
HPE-bug-id: LUS-11496
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Change-Id: I4eecfa05966cb7793a01b92b0bc49ffca252976e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51783
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16990 kfilnd: Use NETWORK_TIMEOUT for TAG_RX_CANCEL 82/51782/2
Chris Horn [Thu, 9 Mar 2023 00:18:41 +0000 (18:18 -0600)]
LU-16990 kfilnd: Use NETWORK_TIMEOUT for TAG_RX_CANCEL

We can get ECANCELED for some tagged receives which results in
transaction failure with TN_EVENT_TAG_RX_CANCEL. This can occur due
to problems with either the source or the target, so we should
use NETWORK_TIMEOUT message status.

Test-Parameters: trivial
HPE-bug-id: LUS-11520
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic3c1910f8a8c43447cbbc28129e23350e726830d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51782
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16989 kfilnd: Handle TX_FAIL in WAIT_SEND_COMP 81/51781/3
Chris Horn [Mon, 12 Dec 2022 23:28:54 +0000 (16:28 -0700)]
LU-16989 kfilnd: Handle TX_FAIL in WAIT_SEND_COMP

It is possible for us to get a TN_EVENT_TX_FAIL while transaction is
in TN_STATE_WAIT_SEND_COMP state. We should gracefully handle this
situation rather than LBUG.

Test-Parameters: trivial
HPE-bug-id: LUS-11344
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ib6fc5ed41f12762843fe9f638ffd523699936556
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51781
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16393 o2iblnd: add IBLND_REJECT_EARLY reject reason 51/51651/3
Serguei Smirnov [Thu, 13 Jul 2023 00:29:56 +0000 (17:29 -0700)]
LU-16393 o2iblnd: add IBLND_REJECT_EARLY reject reason

Add IBLND_REJECT_EARLY reason for rejecting connection request:
to be used when the device doesn't have any nets added yet or
when there are no active NIs on the net to handle the connection.
These conditions are supposed to occur only when LNI is being
added/initialized, so report at CNETERROR level vs. CERROR.

In lnet, set NI state to ACTIVE only after it has been added
to the list of NIs for the net, so that LND can know that
the NI can be used to accept connections.

Test-parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I59efb2fdf5d5ceabb6ff23f638ec85da82d57b99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51651
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17021 socklnd: fix late ksnr_max_conns set 90/51890/4
Cyril Bordage [Tue, 8 Aug 2023 13:06:23 +0000 (15:06 +0200)]
LU-17021 socklnd: fix late ksnr_max_conns set

ksnr_max_conns was set to the correct value after it was used.

Test-Parameters: trivial
Fixes: a5cbe7883db6 ("LU-12815 socklnd: allow dynamic setting of conns_per_peer")
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I9f2454d915ee1ab27db96f5247028db94965a11f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51890
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-17010 lfsck: don't create trans in dryrun mode 49/51849/3
Hongchao Zhang [Sat, 22 Jul 2023 08:29:57 +0000 (16:29 +0800)]
LU-17010 lfsck: don't create trans in dryrun mode

In LFSCK, the LFSCK transaction should not be created in
dryrun mode, which is related to the following patch:

Fixes: 0c1ae1cb9c19 ("LU-13124 scrub: check for multiple linked file")
Change-Id: Id543bc3c0e300c1cc14b670d724ebcacac3bf71b
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51849
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-17000 lnet: fix use-after-free in lnet_startup_lndnet 06/51806/2
Timothy Day [Sat, 29 Jul 2023 19:58:47 +0000 (19:58 +0000)]
LU-17000 lnet: fix use-after-free in lnet_startup_lndnet

If the lnd_startup function returns a positive
error code, the ni will get freed. But the code
incorrectly checks only for negative error codes,
leading to a potential use-after-free.
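
The class of bug can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the lnet_startup_lndnet() code: the NI class, failing_startup(), and both startup functions are hypothetical, and the sketch assumes the fix is to treat any non-zero return value as a failure.

```python
class NI:
    """Toy network interface object that detects use after free."""

    def __init__(self):
        self.freed = False
        self.state = "init"

    def set_state(self, state):
        if self.freed:
            raise RuntimeError("use after free")
        self.state = state


def failing_startup(ni):
    """Model of a startup hook that fails with a POSITIVE error code."""
    ni.freed = True      # the NI is released on the failure path
    return 5


def startup_negative_only(ni, lnd_startup):
    """Buggy pattern: only negative values are treated as errors."""
    rc = lnd_startup(ni)
    if rc < 0:
        return rc
    ni.set_state("active")   # reached even when rc > 0
    return 0


def startup_checked(ni, lnd_startup):
    """Safer pattern: any non-zero value is treated as a failure."""
    rc = lnd_startup(ni)
    if rc != 0:
        return -abs(rc)      # normalise to a negative error code
    ni.set_state("active")
    return 0
```

With failing_startup(), the first variant touches the freed object, while the second returns an error without using it.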

Addresses-Coverity-ID: 397786 ("Use after free")

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I36dd4dbfc0b409de010257e5d9ae9d983fd1817f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51806
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16974 utils: lfs mirror resync to show progress 50/51750/12
Alex Zhuravlev [Wed, 2 Aug 2023 11:46:28 +0000 (14:46 +0300)]
LU-16974 utils: lfs mirror resync to show progress

lfs mirror resync should be able to:
 - show progress like lfs mirror extend --stats does
 - throttle like lfs mirror extend -W does

Use a 64MB buffer for mirror resync by default.

Change-Id: Ibe60748542ff4a3731aa6a4a9907be82427a0ae9
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51750
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-8191 lnet: convert functions in utils to static 27/51427/3
Timothy Day [Fri, 23 Jun 2023 20:44:17 +0000 (20:44 +0000)]
LU-8191 lnet: convert functions in utils to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in various LNet utils and lnetconfig to
static.

In LNet selftest (lst), one unused function was
removed entirely. Some declarations were moved so they
could be made static.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia4528281b3c87d77e46abb95f47ab0bdc72168c0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51427
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16883 ldiskfs: update for ext4-delayed-iput for SUSE 15 52/51252/3
Shaun Tancheff [Thu, 8 Jun 2023 06:32:17 +0000 (13:32 +0700)]
LU-16883 ldiskfs: update for ext4-delayed-iput for SUSE 15

ext4-delayed-iput patch does not apply cleanly to SUSE 15
SP4 and SP5 series 5.14.21 kernel.

Adjust the minor conflict in ext4_put_super()

Test-Parameters: trivial
Fixes: 616fa9b581 ("LU-15404 ldiskfs: use per-filesystem workqueues to avoid deadlocks")
HPE-bug-id: LUS-11661
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iee424bd6d455853d9f82e6e5b08e4ab44deb432c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51252
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16862 rpm: set kmod-lustre-tests requires kmod-lustre explicitly 91/51191/6
Xinliang Liu [Thu, 1 Jun 2023 10:00:04 +0000 (10:00 +0000)]
LU-16862 rpm: set kmod-lustre-tests requires kmod-lustre explicitly

Kmod-lustre-tests rpm should be installed along with kmod-lustre rpm
if there is one.

Test-Parameters: trivial
Change-Id: Ib265298381c317a03c4244f8ea380c6d64f0aef5
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51191
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-9859 libcfs: remove cfs_size_round() 09/51009/4
James Simmons [Tue, 16 May 2023 13:59:15 +0000 (09:59 -0400)]
LU-9859 libcfs: remove cfs_size_round()

Now that everyone is moved to round_up() we can safely remove the
macro cfs_size_round().
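
For readers unfamiliar with round_up(), the Linux kernel macro rounds a value up to the next multiple of a power-of-two alignment. The Python below mirrors that arithmetic as an illustration only; the 8-byte alignment used in the examples is an assumption chosen for the sketch, not something stated in this commit message.

```python
def round_up(x, y):
    """Round x up to the next multiple of y, where y is a power of two.

    Mirrors the kernel expression ((x - 1) | (y - 1)) + 1.
    """
    if y <= 0 or y & (y - 1):
        raise ValueError("alignment must be a positive power of two")
    return ((x - 1) | (y - 1)) + 1


print(round_up(9, 8))    # 16
print(round_up(16, 8))   # 16
```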

Test-Parameters: trivial
Change-Id: If8e1aff5e89007eeb38c5810c68282d51e37f019
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>