Whamcloud - gitweb
fs/lustre-release.git
17 months ago LU-16243 tests: Specify ping source in test_218 91/48891/2
Chris Horn [Wed, 5 Oct 2022 19:44:10 +0000 (13:44 -0600)]
LU-16243 tests: Specify ping source in test_218

In sanity-lnet test_218 we want to drop all traffic from "nid1" to
itself. Use the --source argument to lnetctl ping command to ensure
that the ping is not sent from nid2 to nid1.

HPE-bug-id: LUS-11275
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ib82f182e1da5af303d4763d090d868196eb0ad70
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48891
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16215 kfilnd: Use immediate for routed GETs 87/48787/3
Chris Horn [Mon, 3 Oct 2022 20:46:55 +0000 (14:46 -0600)]
LU-16215 kfilnd: Use immediate for routed GETs

struct lnet_msg::msg_md is NULL on routed GETs (or GETs being sent
to a router). As such we need to use immediate sends for these.

HPE-bug-id: LUS-11268
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I69db5ec36a04b2a2a78d3e1a1b506eefbe8c6484
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48787
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16207 build: add rpm-build BuildRequires for SLES15 SP3 60/48760/3
Jian Yu [Tue, 4 Oct 2022 16:24:36 +0000 (09:24 -0700)]
LU-16207 build: add rpm-build BuildRequires for SLES15 SP3

SLES15 SP3 fails to build using rpm-build-4.14.1-29.46
from the main O/S repository with error message:

- Dependency tokens must begin with alpha-numeric,
  '_' or '/': BuildRequires: %kernel_module_package_buildreqs

Updating rpm-build to 4.14.3-150300.46.1 or higher
resolved the build issue.

Test-Parameters: trivial clientdistro=sles15sp3 \
testlist=sanity

Change-Id: I80099e7ba2d98e07b9877183879766f3dd7f3c1a
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48760
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16111 build: Fix include of stddef.h 67/48367/3
Shaun Tancheff [Sun, 28 Aug 2022 15:13:06 +0000 (22:13 +0700)]
LU-16111 build: Fix include of stddef.h

In kernel builds, include linux/stddef.h.

Test-Parameters: trivial
HPE-bug-id: LUS-11185
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0db81e01fadd01445515f96b3d04a2ec51f43044
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48367
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-8151 obd: Show correct shadow mountpoints for server 31/47131/14
Arshad Hussain [Tue, 3 May 2022 08:27:09 +0000 (04:27 -0400)]
LU-8151 obd: Show correct shadow mountpoints for server

server_fill_super_common() preps the server for mounting
and forces the "read only" (SB_RDONLY) flag to restrict IO on
the server. As a result, the mount command always reports the
filesystem as "ro" even when it is actually "rw".

This patch double-checks the obd statfs (FS) state for the
"read only" flag (OS_STATFS_READONLY) and, if the filesystem is
not really read-only, clears the SB_RDONLY flag.
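
As a rough illustration of the check described above, the following
Python sketch models the flag handling. It is not the kernel code: the
numeric flag values and the helper name fix_mount_flags() are
assumptions made only for this example.

```python
# Illustrative values only; the real constants live in kernel and Lustre headers.
SB_RDONLY = 0x1            # assumed value of the superblock read-only flag
OS_STATFS_READONLY = 0x2   # assumed value of the statfs read-only state bit


def fix_mount_flags(sb_flags: int, statfs_state: int) -> int:
    """Return superblock flags with SB_RDONLY cleared, unless the backing
    filesystem really reports itself as read-only (hypothetical helper)."""
    if statfs_state & OS_STATFS_READONLY:
        # The target is genuinely read-only: leave the flags alone.
        return sb_flags
    # SB_RDONLY was only forced during setup: remove it so mount shows "rw".
    return sb_flags & ~SB_RDONLY
```

With these assumed values, fix_mount_flags(SB_RDONLY, 0) drops the
read-only bit, while a state containing OS_STATFS_READONLY keeps it.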

The client output remains unchanged.

Output before patch:
/dev/.../mds1_flakey on /mnt/lustre-mds1 type lustre (ro,svname=...)
/dev/.../ost1_flakey on /mnt/lustre-ost1 type lustre (ro,svname=...)

Output after patch:
/dev/.../mds1_flakey on /mnt/lustre-mds1 type lustre (rw,svname=...)
/dev/.../ost1_flakey on /mnt/lustre-ost1 type lustre (rw,svname=...)

Test case conf-sanity/113 added.

Test-Parameters: trivial fstype=zfs testlist=conf-sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ie92a686ae97dd62885f415b453bad6bdc0ed3d28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-15011 tests: additional checks for pool spilling 74/45074/15
Alex Zhuravlev [Tue, 28 Sep 2021 17:04:36 +0000 (20:04 +0300)]
LU-15011 tests: additional checks for pool spilling

Check that spilling is not used when there is enough space.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If01bf0d03b73c5c985af5a784096356e723cbf0c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45074
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-8837 mgc: move server-only code out of mgc_request.c 95/41995/10
Mr NeilBrown [Wed, 26 Oct 2022 18:53:29 +0000 (14:53 -0400)]
LU-8837 mgc: move server-only code out of mgc_request.c

Create a new mgc_request_server.c to contain all the server-only code
from mgc_request.c.
Among other changes, this involves splitting
  mgc_process_recover_nodemap_log()
into two separate functions:
 mgc_process_recovery_log() for cld_is_recover() case
 mgc_process_nodemap_log() for cld_is_nodemap() case

This does add some code duplication, but removes a lot of repetitive
case checking.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9b3795d6c8ea2c812b98a3388d687af1d7732e0a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41995
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-8837 lustre: make ldlm and target file lists 67/41767/11
Mr NeilBrown [Thu, 27 Oct 2022 19:15:00 +0000 (15:15 -0400)]
LU-8837 lustre: make ldlm and target file lists

Instead of listing files for ldlm and target in ptlrpc,
list them in Makefile.in in the respective directories.

This requires that Makefile.am be moved to autoMakefile.am to preserve
MOSTLYCLEANFILES.

This simplifies the makefiles in preparation for changes in what is
included on the client.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9f937e9460f1fe2ef436f8f7ace8999dd510885e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41767
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-13135 quota: improve checks in OSDs to ignore quota 32/37232/16
Alex Zhuravlev [Tue, 14 Jan 2020 19:38:51 +0000 (22:38 +0300)]
LU-13135 quota: improve checks in OSDs to ignore quota

for root-owned files.

sanity/60a:
  zfs before 80s, after 66s
  ldiskfs before 65s, after 38s

ave.write declaration in sanity/60a:
  zfs before 3.21 usec, after 1.16 usec
  ldiskfs before 4.06 usec, after 0.66 usec

Change-Id: Ib9ba50d260eac408f1f5e43c4d722ff5024135cf
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/37232
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-10391 lnet: fix build issue when IPv6 is disabled. 90/48990/2
James Simmons [Mon, 31 Oct 2022 16:43:43 +0000 (12:43 -0400)]
LU-10391 lnet: fix build issue when IPv6 is disabled.

struct inet6_dev and struct inet6_ifaddr are not defined if IPv6
is not configured for the Linux kernel.

Test-Parameters: trivial
Fixes: 781499eee64 ("LU-10391 lnet: support IPv6 in lnet_inet_enumerate()")
Change-Id: I8b16ad7bea1394c4560130190023590213ff2ded
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48990
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-15847 target: report multiple transno to debug log 58/48958/2
Mikhail Pershin [Wed, 26 Oct 2022 08:17:11 +0000 (11:17 +0300)]
LU-15847 target: report multiple transno to debug log

Don't report multiple transaction cases to the console;
log them as debug messages instead.

Fixes: 4e2e8fd2fc0a ("LU-15847 tgt: reply always with the latest assigned transno")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If9b47dfedcaf67487954189e8a75d2029a502469
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48958
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-16203 llog: skip bad records in llog 76/48776/6
Mikhail Pershin [Mon, 3 Oct 2022 15:35:25 +0000 (18:35 +0300)]
LU-16203 llog: skip bad records in llog

This patch further develops the idea of skipping bad
(corrupted) llog data. If an llog has fixed-size records,
it is possible to skip a single record instead of the rest
of the llog block.

The patch also fixes skipping to the next chunk:
 - make sure to skip to the next block for a partial chunk,
   otherwise the same block is re-read
 - handle index == 0 as the goal for llog_next_block() as an
   expected exclusion and just return the requested block
 - after a block was skipped, set the new index to the first
   one in the block
 - don't create a fake padding record in llog_osd_next_block(),
   as the caller can handle it and would know about it
 - restore test_8 functionality to check corruption handling
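
The fixed-size case can be pictured with a small Python sketch. This
is only a model under stated assumptions: the function name, the
record size and the is_valid() predicate are hypothetical and do not
correspond to the llog API.

```python
def scan_fixed_size_records(block: bytes, rec_size: int, is_valid):
    """Walk a block of fixed-size records, skipping only the bad ones.

    Returns (good_records, skipped_indexes). Because every record has the
    same size, a bad record does not hide where the next one starts.
    """
    good, skipped = [], []
    for offset in range(0, len(block) - rec_size + 1, rec_size):
        record = block[offset:offset + rec_size]
        if is_valid(record):
            good.append(record)
        else:
            # Skip just this record; the rest of the block is still usable.
            skipped.append(offset // rec_size)
    return good, skipped
```

With variable-size records the same loop would not work, since a
corrupted length field makes the start of the next record unknown.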

Fixes: ec4194e4e78c ("LU-11591 llog: add synchronization for the last record")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6f88269e8626269268352f8bfd6d7950de438f3a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48776
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16152 utils: fix integer overflow in cYAML parser 49/48649/7
Lei Feng [Mon, 26 Sep 2022 06:31:57 +0000 (14:31 +0800)]
LU-16152 utils: fix integer overflow in cYAML parser

Convert double to int64 correctly in cYAML parser.
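
The commit message does not show the actual change. As a hedged
illustration of what a range-checked double to int64 conversion can
look like, here is a Python sketch; the function name
double_to_int64() is hypothetical and the real parser is written in C.

```python
import math

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def double_to_int64(value: float) -> int:
    """Convert a parsed number to a signed 64-bit integer, rejecting
    values that do not fit instead of silently wrapping."""
    if math.isnan(value) or math.isinf(value):
        raise OverflowError("not a finite number")
    result = int(value)  # truncates toward zero, like a C cast
    if result < INT64_MIN or result > INT64_MAX:
        raise OverflowError(f"{value!r} does not fit in int64")
    return result
```

A value such as 4294967296.0 (2^32) overflows a 32-bit int but converts
cleanly here, which is the kind of case an int-sized cast gets wrong.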

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity env=ONLY=65p
Change-Id: Ia3fd515c76ebfe6e5181301f53c702ef82056eba
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-10391 lnet: allow ping packet to contain large nids 28/44628/20
Mr NeilBrown [Thu, 27 Oct 2022 13:58:02 +0000 (09:58 -0400)]
LU-10391 lnet: allow ping packet to contain large nids

The ping packet has an array of fixed-size status entries that only
have room for a 4-byte-address nid.

This patch adds a feature flag which activates a list of
variable-sized entries after the initial array.

Each entry contains a 4-byte status and then a nid, rounded to a
multiple of 4 bytes.  The total number of bytes of the ping_info
(header, first array, subsequent list) is stored in the ns_unused
field of the first entry in the array.
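
The size arithmetic described above can be sketched in Python. The
header and fixed-entry sizes are passed in as parameters because the
commit message does not state them; all function names are
hypothetical.

```python
STATUS_BYTES = 4  # each variable-sized entry starts with a 4-byte status


def round_up4(length: int) -> int:
    """Round a byte count up to the next multiple of 4."""
    return (length + 3) & ~3


def large_entry_size(nid_bytes: int) -> int:
    """Bytes used by one variable-sized entry: status plus padded nid."""
    return STATUS_BYTES + round_up4(nid_bytes)


def ping_info_size(header_bytes, fixed_entries, fixed_entry_bytes,
                   large_nid_bytes):
    """Total size: header, fixed-size array, then the variable-sized list.
    This is the value the message says is stored in ns_unused."""
    total = header_bytes + fixed_entries * fixed_entry_bytes
    for nid_bytes in large_nid_bytes:
        total += large_entry_size(nid_bytes)
    return total
```

For example, an 18-byte nid is padded to 20 bytes and its entry takes
24 bytes once the status word is included.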

The user-space interfaces only see the initial array.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I774641d8cda24251337ce2d055caf05a14a9e088
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-15947 obdclass: improve precision of wakeups for mod_rpcs 41/44041/11
Mr NeilBrown [Mon, 21 Jun 2021 03:25:42 +0000 (13:25 +1000)]
LU-15947 obdclass: improve precision of wakeups for mod_rpcs

There is a limit on the number of in-flight mod rpcs, with a
complication that a 'close' rpc is always permitted if there are no
other close rpcs in flight, even if that would exceed the limit.

When a non-close request completes, we just wake the first waiting
request and assume it will use the slot we released.  When a
close-request completes, the first waiting request may not find a slot
if the close was using the 'extra' slot.  So in that case we wake all
waiting requests and let them fight it out.  This is wasteful and
unfair.

To correct this we revise the wait/wake approach to use a dedicated
wakeup function which atomically checks if a given task can proceed,
and updates the counters when permission to proceed is given.  This
means that once a task has been woken, it has already been accounted
and it can proceed.

To minimise locking, cl_mod_rpcs_lock is discarded and
cl_mod_rpcs_waitq.lock is used to protect the counters.  For the
fast-path where the max has not been reached, this means we take and
release that spinlock just once.  We call wake_up_locked while still
holding the lock, and if that woke the process, then we don't drop the
spinlock to wait, but proceed directly to the remainder of the task.

When the last 'close' rpc completes, the wake function will iterate
the whole wait queue until it finds a task waiting to submit a close
request.  When any other rpc completes, the queue will only be
searched until the maximum is reached.
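
The accounting rule can be modelled in a few lines of Python. This is
a single-threaded sketch, not the kernel implementation: the class and
method names are hypothetical, no real wait queue or spinlock is
involved, and the wake step always scans the whole queue rather than
stopping early as the patch describes.

```python
from collections import deque


class ModRpcSlots:
    """Toy model: 'permission to proceed' and counter update happen together."""

    def __init__(self, max_in_flight: int):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.close_in_flight = 0
        self.waiters = deque()  # (name, is_close) in arrival order

    def _can_proceed(self, is_close: bool) -> bool:
        if self.in_flight < self.max_in_flight:
            return True
        # One 'close' may use an extra slot if no other close is in flight.
        return is_close and self.close_in_flight == 0

    def _account(self, is_close: bool) -> None:
        self.in_flight += 1
        if is_close:
            self.close_in_flight += 1

    def get_slot(self, name: str, is_close: bool = False) -> bool:
        """Return True if the request may proceed now; otherwise queue it."""
        if self._can_proceed(is_close):
            self._account(is_close)
            return True
        self.waiters.append((name, is_close))
        return False

    def put_slot(self, is_close: bool = False) -> list:
        """Release a slot; return the waiters that were woken, already accounted."""
        self.in_flight -= 1
        if is_close:
            self.close_in_flight -= 1
        woken, still_waiting = [], deque()
        for name, waiter_is_close in self.waiters:
            if self._can_proceed(waiter_is_close):
                self._account(waiter_is_close)
                woken.append(name)
            else:
                still_waiting.append((name, waiter_is_close))
        self.waiters = still_waiting
        return woken
```

Because a waiter is accounted at the moment it is chosen, nothing is
woken only to discover that no slot is available.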

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iff094c3188a3bd8a04edc1d5d98ec3014e2b059b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44041
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-10003 lnet: use Netlink to support old and new NI APIs. 14/48814/13
James Simmons [Fri, 28 Oct 2022 18:02:17 +0000 (14:02 -0400)]
LU-10003 lnet: use Netlink to support old and new NI APIs.

The LNet layer uses two different sets of ioctls. One ioctl set is
for Multi-Rail and the other is an older API. Both are in heavy
use and with the upcoming support for IPv6 we are looking at an
explosion of ioctls. The solution is to move the LNet layer to
Netlink which can easily handle all the differences between the
APIs. This also resolves a long standing issue of the user land
API constantly changing in a non-compatible way with previous
versions.

This patch unifies the handling of the LNet NI to use Netlink and is
fully aware of the new large NID addressing.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: Idf3384fe7cd0f593f149fd5d8f3a101e8bd8a7f6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48814
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-15117 ofd: no lock for dt_bufs_get() in read path 09/48209/5
Alex Zhuravlev [Fri, 12 Aug 2022 11:24:17 +0000 (14:24 +0300)]
LU-15117 ofd: no lock for dt_bufs_get() in read path

osd_bufs_get() allocates the pages and can cause new transactions
as part of the memory release procedure. This would break Lustre's
"start a transaction, then do locking" rule.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I782f0cc6c96251ad88d5fb8d15c9ac91d382bf7e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48209
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-16222 kernel: RHEL 8.7 client and server support 79/48879/4
Jian Yu [Wed, 26 Oct 2022 01:33:53 +0000 (18:33 -0700)]
LU-16222 kernel: RHEL 8.7 client and server support

This patch makes changes to support RHEL 8.7 release
with kernel 4.18.0-423.el8 for Lustre client and server.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Change-Id: Ie97ff67c9a5fbd46bc145ab559665dcbc630b4a0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Co-Authored-By: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48879
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16256 tests: sanity-flr to remove temp files 22/48922/5
Alex Zhuravlev [Tue, 18 Oct 2022 12:41:53 +0000 (15:41 +0300)]
LU-16256 tests: sanity-flr to remove temp files

This lets the test run on smaller devices.

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I800bcb7a8d847e1f0d44f344c3810ef298c8507a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48922
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-15564 osd: add allocation time histogram 50/46550/15
Alex Zhuravlev [Fri, 18 Feb 2022 08:39:12 +0000 (11:39 +0300)]
LU-15564 osd: add allocation time histogram

Add a block mapping/allocation time histogram to brw stats to debug
mballoc-related issues.

$ lctl get_param osd*.*OST*.brw_stats
                        read           |      write
block maps msec       maps   % cum %   |   maps   % cum %
1:                 1522360 100   100   |  49272  99    99
2:                       0   0   100   |      1   0    99
4:                       0   0   100   |      1   0    99
8:                       0   0   100   |      0   0    99
16:                      0   0   100   |      0   0    99
32:                      0   0   100   |      0   0    99
64:                      0   0   100   |      1   0   100
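
To make the "%" and "cum %" columns concrete, the Python sketch below
recomputes them from per-bucket counts. It assumes power-of-two
buckets and truncating integer percentages, which is consistent with
the sample output above but is not confirmed by the commit message;
the function name is hypothetical.

```python
def histogram_rows(counts):
    """Return (bucket_msec, count, percent, cumulative_percent) per bucket.

    counts[i] is the number of block mappings in the bucket labelled 2**i.
    Percentages are truncated to integers (an assumption).
    """
    total = sum(counts)
    rows = []
    running = 0
    for i, count in enumerate(counts):
        running += count
        pct = count * 100 // total if total else 0
        cum = running * 100 // total if total else 0
        rows.append((1 << i, count, pct, cum))
    return rows
```

Feeding in the write column from the sample (49272, 1, 1, 0, 0, 0, 1)
gives 99 in the first row and reaches a cumulative 100 only in the
last bucket, as in the output shown.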

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I1185386adc64e844de71e25a4e439e493e5e5bc5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46550
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-16252 tests: test_129() fix for not combined_mgs_mds 18/48918/4
Elena Gryaznova [Wed, 19 Oct 2022 11:58:53 +0000 (14:58 +0300)]
LU-16252 tests: test_129() fix for not combined_mgs_mds

To reproduce the failure, just run test_129 on a setup
without combined_mgs_mds:
  ONLY=129 sh conf-sanity.sh
    Start of /dev/vdb on ost1 failed 110
    conf-sanity test_129: @@@@@@ FAIL: start ost1 failed

Fixes: cefabee525 ("LU-15112 mgc: do not ignore target registration failure")
Test-Parameters: trivial testlist=conf-sanity env=ONLY=129 combinedmdsmgs=false
Test-Parameters: testlist=conf-sanity env=ONLY=129 combinedmdsmgs=true
HPE-bug-id: LUS-10708
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: If8317feb40345d057a0e38dfb4ff95448953f6ff
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48918
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16241 obdclass: NL_SET_ERR_MSG work for older kernel 16/48916/6
Arshad Hussain [Wed, 19 Oct 2022 04:29:42 +0000 (09:59 +0530)]
LU-16241 obdclass: NL_SET_ERR_MSG work for older kernel

The NL_SET_ERR_MSG macro is already defined in kernels
3.10.0-1160 and above. For older kernels (3.10.0-957)
where it is not defined, we put the message in the
system log as a workaround.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9
Test-Parameters: trivial clientdistro=el8.5 serverdistro=el8.5
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I6830e1dd2ca84df09ef89aaaa9e9b802d9cdbd16
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48916
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-16249 sec: krb5_decrypt_bulk calls decryption primitive 07/48907/2
Sebastien Buisson [Tue, 18 Oct 2022 15:19:01 +0000 (17:19 +0200)]
LU-16249 sec: krb5_decrypt_bulk calls decryption primitive

krb5_decrypt_bulk() was mistakenly calling an encryption primitive
instead of a decryption primitive for the confounder.

Test-Parameters: trivial
Fixes: 0a65279121 ("LU-13344 gss: Update crypto to use sync_skcipher")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9251172644ed6baa3bb06a59dbe7c1bab401d817
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48907
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16232 scripts: changelog/updatelog emergency cleanup 38/48838/4
Mikhail Pershin [Wed, 12 Oct 2022 09:22:14 +0000 (12:22 +0300)]
LU-16232 scripts: changelog/updatelog emergency cleanup

Emergency cleanup scripts for situations when llogs are
corrupted and can't be cleaned up in a normal way. In such
cases the recommendation is to remove/truncate those llogs.

The scripts perform all the needed steps and have a debugging
option to collect llogs for further analysis.

Possible script actions are:
 - dry-run mode to check all actions and files affected
 - create archive with all llogs for analysis
 - remove llogs including all plain llogs

Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I3b197179bc54f451e3c5d7db36b6f1c56c076856
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48838
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16172 o2iblnd: add verbose debug prints for rx/tx events 00/48600/6
Serguei Smirnov [Fri, 20 May 2022 00:20:14 +0000 (17:20 -0700)]
LU-16172 o2iblnd: add verbose debug prints for rx/tx events

Added/modified debug messages for syncing with mlnx driver
debug output. On rx/tx events print message type, size and
peer credits. Make printing of debug message on o2iblnd conn
refcount change events compile-time optional. Add compile-time
option for dumping detailed connection state info to net debug log.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: If7c2de56d8e4ef71085c3b49caf589e2f3864b15
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48600
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-16258 llite: Explicitly support .splice_write 28/48928/2
Shaun Tancheff [Fri, 21 Oct 2022 04:54:49 +0000 (23:54 -0500)]
LU-16258 llite: Explicitly support .splice_write

Linux commit v5.9-rc1-6-g36e2c7421f02
  fs: don't allow splice read/write without explicit ops

Lustre supports splice_write and previously provided handlers
for splice_read.
Explicitly use iter_file_splice_write, if it exists.

HPE-bug-id: LUS-11259
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I858688fc9b4dd370b6018c3b134f01e580477b25
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-15852 lnet: Don't modify uptodate peer with temp NI 22/47322/3
Chris Horn [Wed, 30 Mar 2022 18:35:23 +0000 (13:35 -0500)]
LU-15852 lnet: Don't modify uptodate peer with temp NI

When processing the config log it is possible that we attempt to
add temp NIs after discovery has completed on a peer. These temp NIs
may not actually exist on the peer. Since discovery has already
completed the peer is considered up-to-date and we can end up with
incorrect peer entries. We shouldn't add temp NIs to a peer that
is already up-to-date.

HPE-bug-id: LUS-10867
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ia484713b1e6c9e1a46e525589b7c741c6478e417
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47322
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-15139 osp: block reads until the object is created 03/47003/24
Alex Zhuravlev [Wed, 6 Apr 2022 08:00:30 +0000 (11:00 +0300)]
LU-15139 osp: block reads until the object is created

It's possible for a remote llog to be read and written simultaneously
during recovery. For example, the dtx recovery thread is fetching updates
while MDD's orphan cleanup procedure is removing orphans from PENDING.

OSP can be asked to read an object that was just created in the OSP
cache while the actual object on the remote MDS hasn't been created
yet. OSP should block such reads until the creation is done.
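
The "block reads until creation completes" idea can be sketched in
user-space Python with a threading.Event. This is only an analogy: the
class and method names are hypothetical, and the OSP code uses kernel
primitives rather than anything shown here.

```python
import threading


class CachedObject:
    """Toy stand-in for an object that exists in a local cache before the
    remote side has finished creating it."""

    def __init__(self):
        self._created = threading.Event()
        self._data = b""

    def mark_created(self, data: bytes) -> None:
        """Called once the remote creation has completed."""
        self._data = data
        self._created.set()

    def read(self, timeout=None) -> bytes:
        """Block until the object really exists, then return its data."""
        if not self._created.wait(timeout):
            raise TimeoutError("object not created yet")
        return self._data
```

A reader that arrives early simply sleeps in read() until
mark_created() is called, instead of observing a half-created object.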

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id0f52b90761839399102bed825569da6bfd17864
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47003
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months ago LU-15626 tests: Fix "error" reported by shellcheck for replay-dual 35/46835/4
Arshad Hussain [Wed, 16 Mar 2022 08:24:32 +0000 (13:54 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck for replay-dual

This patch fixes "error" issues reported by shellcheck
for file lustre/tests/replay-dual.sh. This patch also
moves spaces to tabs.

Test-Parameters: trivial testlist=replay-dual
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ie195dd39dd4789be660115b360b5b8bf6ebc1a57
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46835
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
17 months ago LU-15626 tests: Fix "error" reported by shellcheck 11/46811/5
Arshad Hussain [Sat, 12 Mar 2022 06:14:20 +0000 (11:44 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck

This patch fixes "error" issues reported by shellcheck
for *.sh files. These files had only a single error
reported by shellcheck. The change in these files is
init_test_env $@ (->to->) init_test_env "$@"

Test-Parameters: trivial
Test-Parameters: testlist=dom-performance,scrub-performance
Test-Parameters: testlist=replay-single,replay-ost-single,replay-vbr
Test-Parameters: testlist=sanity-pcc,sanity-pfl,sanity-selinux
Test-Parameters: testlist=sanity-benchmark,parallel-scale
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I21fc2f25eb67d724b9e30c586568d2501648a80a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46811
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
17 months ago LU-15619 osc: Remove oap lock 19/46719/7
Patrick Farrell [Fri, 4 Mar 2022 22:08:44 +0000 (17:08 -0500)]
LU-15619 osc: Remove oap lock

The OAP lock is taken around setting the oap flags, but not
any of the other fields in oap.  As far as I can tell, this
is just some cargo cult belief about locking - there's no
reason for it.

Remove it entirely.  (From the code, a queued spin lock
appears to be 12 bytes on x86_64.)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib61190d52c08d88c95a0c19b8ef7d114e26cfae2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46719
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
17 months ago LU-15046 osp: precreate thread vs connect race 99/45099/24
Alex Zhuravlev [Thu, 30 Sep 2021 12:16:57 +0000 (15:16 +0300)]
LU-15046 osp: precreate thread vs connect race

lcs_exp (required for the fid client) was initialized in
osp_obd_connect(), which races with osp_precreate_thread(). The latter
can get stuck if lcs_exp is not initialized, and then the whole
precreation logic is blocked until remount. Instead, the precreation
thread can simply wait until lcs_exp is properly initialized.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7a42bf4b17ce5d46bc25bd548d81eb55f168804b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45099
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months ago LU-6142 obdclass: make ccc_users in cl_client_cache a refcount_t 81/48881/2
Mr. NeilBrown [Fri, 7 Oct 2022 13:53:38 +0000 (09:53 -0400)]
LU-6142 obdclass: make ccc_users in cl_client_cache a refcount_t

As this is used as a refcount, it should be declared
as one.

Change-Id: I5af513ccb2b706a398e647ce0427affa4516a9b5
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48881
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months agoLU-16160 llite: clear stale page's uptodate bit 07/48607/18
Bobi Jam [Tue, 20 Sep 2022 16:27:04 +0000 (00:27 +0800)]
LU-16160 llite: clear stale page's uptodate bit

With the truncate_inode_page()->do_invalidatepage()->ll_invalidatepage()
call path running before the vmpage is deleted from the page cache, the
page can still be picked up by
ll_read_ahead_page()->grab_cache_page_nowait().

If ll_invalidatepage()->cl_page_delete() does not clear the vmpage's
uptodate bit, readahead could pick the page up and wrongly treat it as
already uptodate.

In ll_fault()->vvp_io_fault_start()->vvp_io_kernel_fault(),
filemap_fault() calls ll_readpage() to read the vmpage and waits for
the vmpage to be unlocked. After ll_readpage() successfully reads the
vmpage and unlocks it, memory pressure or truncate can get in and
delete the cl_page. filemap_fault() then finds that the vmpage is not
uptodate and returns VM_FAULT_SIGBUS. To fix this, this patch makes
vvp_io_kernel_fault() restart filemap_fault() to get an uptodate vmpage
again.

Test-Parameters: testlist=sanityn env=ONLY="16f",ONLY_REPEAT=50
Test-Parameters: testlist=sanityn env=ONLY="16g",ONLY_REPEAT=50
Test-Parameters: testlist=sanityn env=ONLY="16f 16g",ONLY_REPEAT=50
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I369e1362ffb071ec0a4de3cd5bad27a87cff5e05
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
17 months agoLU-15935 target: keep track of multirpc slots in last_rcvd 82/48082/11
Etienne AUJAMES [Fri, 29 Jul 2022 12:35:33 +0000 (14:35 +0200)]
LU-15935 target: keep track of multirpc slots in last_rcvd

OBD_INCOMPAT_MULTI_RPCS is cleared by tgt_boot_epoch_update() if the
recovery is aborted. This assumes that all the clients are evicted,
but that is not necessarily true. Some clients could have successfully
finished their recovery. In that case, those clients will keep their
last_rcvd slot.

This patch modifies lut_num_client to keep track of multirpc
slots in last_rcvd. For now the counter is used only by tgt_fini()
to clear OBD_INCOMPAT_MULTI_RPCS, so this use can be extended to
tgt_boot_epoch_update().

Add replay-dual test_33.

Test-Parameters: testlist=replay-dual env=ONLY=33,ONLY_REPEAT=30
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I70791c9dcb7cc77f018b9e5c95568598d54f0322
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48082
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
17 months agoRevert "LU-16046 revert: "LU-9964 llite: prevent mulitple group locks""
Oleg Drokin [Tue, 1 Nov 2022 18:39:15 +0000 (14:39 -0400)]
Revert "LU-16046 revert: "LU-9964 llite: prevent mulitple group locks""

This reverts commit bc37f89a81ea0a2fae8668e21247552e8894bfd8.

Unrevert the revert, since the fix that replaced it was bad and
there are now better ideas on how to amend this fix than a
full-on revert.

Change-Id: I1ef28c13715e7ea98021e1f83331e5533c2a8868
Signed-off-by: Oleg Drokin <green@whamcloud.com>
17 months agoRevert "LU-16046 ldlm: group lock fix"
Oleg Drokin [Tue, 1 Nov 2022 18:38:37 +0000 (14:38 -0400)]
Revert "LU-16046 ldlm: group lock fix"

This reverts commit 3ffcb5b700ebfd68dba4daca4192fdacaf7fd541.
It introduced a sleep under spinlock that was missed in testing.

Change-Id: I133e704595e97c0c62f47c23b3996871daf4c0dd
Signed-off-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-14719 lod: distributed transaction check space 39/47039/8
Lai Siyao [Wed, 30 Mar 2022 21:50:22 +0000 (17:50 -0400)]
LU-14719 lod: distributed transaction check space

A distributed transaction failure may cause missing files or
disconnected directories. To avoid failures when the disk is full,
check the remote MDT free space before the transaction starts.

The block/inode watermarks in obd_statfs_info are used to check
whether MDT has enough free blocks/inodes.

Add sanity 230x.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I0922e9c8668e8b842d313576bd68b52fa5d434ac
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-16187 tests: Fix is_project_quota_supported() 54/48654/2
Arshad Hussain [Mon, 26 Sep 2022 09:31:41 +0000 (15:01 +0530)]
LU-16187 tests: Fix is_project_quota_supported()

is_project_quota_supported() is called from sanity-quota.sh
to verify that $ENABLE_PROJECT_QUOTAS is true for the ldiskfs FS
and that the current version of the lfs command supports
'project'.  To do this it calls 'lfs --help', which is
not supported. This patch replaces the 'lfs --help' call with
'lfs --list-commands' to verify that the present
version of lfs supports 'project'.
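
A minimal sketch of this kind of check, assuming 'lfs --list-commands'
prints subcommand names separated by whitespace. The helper name
has_subcommand and the sample list are hypothetical and are not taken
from sanity-quota.sh:

```shell
# Hypothetical helper: succeed if the exact word "$1" appears in the
# whitespace-separated command list read from stdin.
has_subcommand() {
	tr -s '[:space:]' '\n' | grep -qx -- "$1"
}

# On a client this would be fed by the real tool, for example:
#   lfs --list-commands | has_subcommand project
# Here a made-up list stands in for the lfs output.
if printf 'setstripe getstripe\nproject quota\n' | has_subcommand project; then
	echo "project is supported"
fi
```

Matching whole words avoids a false positive on a longer name that
merely contains "project".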

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iba7e6696d3fa9e980088f448ae72b07a4b47f4f2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48654
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16240 build: Use new AS_HELP_STRING 84/48884/4
James Simmons [Tue, 4 Oct 2022 13:36:23 +0000 (07:36 -0600)]
LU-16240 build: Use new AS_HELP_STRING

Starting with autoconf 2.70, AC_HELP_STRING is flagged as obsolete in
favor of AS_HELP_STRING. Move to the newer macro.
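
As a rough illustration of the mechanical rename, the filter below
rewrites the macro name in text read from stdin. The function name and
the sample configure option are made up; this is not necessarily how
the patch was produced.

```shell
# Hypothetical filter: rename the old macro in text read from stdin.
rename_help_macro() {
	sed 's/AC_HELP_STRING/AS_HELP_STRING/g'
}

# Sample input line; the option name is invented for the example.
printf '%s\n' 'AC_HELP_STRING([--enable-foo], [enable foo])' | rename_help_macro
```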

Test-Parameters: trivial
Change-Id: I1d4f69fb844f51f05a8f46751df8b79d93db78f8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48884
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
18 months agoLU-15305 obdclass: fix race in class_del_profile 02/48802/4
Li Dongyang [Fri, 7 Oct 2022 12:09:10 +0000 (23:09 +1100)]
LU-15305 obdclass: fix race in class_del_profile

Move the profile lookup and its removal from lustre_profile_list
into the same critical section; otherwise we could race with
class_del_profiles or another class_del_profile.

Do not create duplicate mount opts in the client config,
otherwise we will add duplicate lustre_profile to
lustre_profile_list for a single mount.

Change-Id: I648aa206716213b064d045f546516b219337e0ed
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48802
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-15807 ksocklnd: fix irq lock inversion while calling sk_data_ready() 15/48715/5
James Simmons [Sun, 2 Oct 2022 13:45:42 +0000 (09:45 -0400)]
LU-15807 ksocklnd: fix irq lock inversion while calling sk_data_ready()

sk->sk_data_ready() of an SCTP socket can be called from both BH and
non-BH contexts, but the ksocklnd version of sk_data_ready,
ksocknal_data_ready(), does not handle the BH case. Change how
ksnd_global_lock is taken in this case.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: testgroup=review-ldiskfs-arm testlist=sanity-lnet
Change-Id: I07fade0da4cdfe095edc7a17e4f65012d6f92942
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48715
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
18 months agoLU-16177 kernel: kernel update RHEL9.0 [5.14.0-70.26.1.el9_0] 76/48676/3
Jian Yu [Thu, 6 Oct 2022 19:02:23 +0000 (12:02 -0700)]
LU-16177 kernel: kernel update RHEL9.0 [5.14.0-70.26.1.el9_0]

Update RHEL9.0 kernel to 5.14.0-70.26.1.el9_0 for Lustre client.

Test-Parameters: trivial clientdistro=el9.0 \
env=SANITY_EXCEPT="130 244a" testlist=sanity

Change-Id: I9da2ccdf419d6490fdba80199eda69f4f19361be
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48676
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-16175 kernel: kernel update SLES12 SP5 [4.12.14-122.133.1] 05/48605/2
Jian Yu [Tue, 20 Sep 2022 03:47:03 +0000 (20:47 -0700)]
LU-16175 kernel: kernel update SLES12 SP5 [4.12.14-122.133.1]

Update SLES12 SP5 kernel to 4.12.14-122.133.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I35596cdfa075a19b5b1d29bad96271cbe83491bb
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48605
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16174 kernel: kernel update SLES15 SP4 [5.14.21-150400.24.21.2] 04/48604/2
Jian Yu [Tue, 20 Sep 2022 03:33:30 +0000 (20:33 -0700)]
LU-16174 kernel: kernel update SLES15 SP4 [5.14.21-150400.24.21.2]

Update SLES15 SP4 kernel to 5.14.21-150400.24.21.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles15sp4 \
env=SANITY_EXCEPT="27J 101j 244a" testlist=sanity

Change-Id: Ia68e1c960c79f40d0f725b0f440cd562b820a19f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48604
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16173 kernel: kernel update SLES15 SP3 [5.3.18-150300.59.93.1] 01/48601/3
Jian Yu [Thu, 13 Oct 2022 01:18:15 +0000 (18:18 -0700)]
LU-16173 kernel: kernel update SLES15 SP3 [5.3.18-150300.59.93.1]

Update SLES15 SP3 kernel to 5.3.18-150300.59.93.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles15sp3 \
testlist=sanity

Change-Id: I1e0afe6974567d13680dbb0d463fbbd873ef2e5f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48601
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16233 build: Add always target for SUSE15 SP3 LTSS 33/48833/2
Shaun Tancheff [Wed, 12 Oct 2022 06:16:21 +0000 (13:16 +0700)]
LU-16233 build: Add always target for SUSE15 SP3 LTSS

SUSE 15 SP3 LTSS kernel version 5.3.18-150300.59.93
(and later) breaks lustre build tests which expect
conftest.i to be generated.

HPE-bug-id: LUS-11286
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If23e9b31b537878a43075ffff62a99906f47fd9a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48833
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-15234 lnet: add mechanism for dumping lnd peer debug info 66/48566/6
Serguei Smirnov [Mon, 28 Feb 2022 19:04:00 +0000 (11:04 -0800)]
LU-15234 lnet: add mechanism for dumping lnd peer debug info

Add ability to dump lnd peer debug info:
lnetctl debug peer --nid=<nid>

The debug info is dumped to the log as D_CONSOLE by the respective
lnd and can be retrieved with "lctl dk" or seen in syslog.
This mechanism has been added for socklnd and o2iblnd peers.
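
For example, the two commands can be combined as below. This is a
sketch only: the wrapper name is hypothetical, and filtering the log by
NID assumes that the dumped lines mention the peer NID, which may not
hold for every LND.

```shell
# Hypothetical wrapper: ask the LND to dump debug info for one peer,
# then show matching lines from the kernel debug log.
dump_lnd_peer() {
	local nid=$1

	lnetctl debug peer --nid="$nid" || return 1
	# Assumption: the dumped lines contain the NID string.
	lctl dk | grep -F -- "$nid"
}

# Usage on a node with LNet configured (the NID is a placeholder):
#   dump_lnd_peer 192.168.1.10@tcp
```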

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia9c4d59143206bcb7ec43806594cf0cfaed5f0a9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48566
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-15795 lbuild: enable KABI 07/47507/5
Minh Diep [Tue, 20 Sep 2022 18:24:54 +0000 (11:24 -0700)]
LU-15795 lbuild: enable KABI

Enable build kabi and clean up kmodtool patch

Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.5 serverdistro=el8.5
Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.6 serverdistro=el8.6

Change-Id: I16d54af0004c4ddc1cc5e6acca81e4aa89a1a1c1
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47507
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-12130 test: pool inheritance for mdt component 91/46391/4
Vitaly Fertman [Mon, 31 Jan 2022 15:43:14 +0000 (18:43 +0300)]
LU-12130 test: pool inheritance for mdt component

Test whether the pool info is inherited for the MDT component,
which is not supposed to happen.

Test-Parameters: testlist=sanity env=ONLY=65o
Change-Id: I07e15fe2979c2e8887024fb959af2926425d258a
HPE-bug-id: LUS-7180
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-15447 tests: sanity-flr/208 reset rotational status 88/46088/15
Alex Zhuravlev [Thu, 13 Jan 2022 07:27:21 +0000 (10:27 +0300)]
LU-15447 tests: sanity-flr/208 reset rotational status

New kernels (e.g. 4.18.0-305.25.1) declare loopback devices
in tmpfs as non-rotational. sanity-flr/208 wrongly assumes
that devices are non-rotational by default. Thus,
sanity-flr/208 started to fail with new kernels.

Fixes: 8507472dd37e ("LU-14996 lov: prefer mirrors on non-rotational OSTs")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib5c42da39667227a6cff5d379e30d2cd6c1e2773
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46088
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16183 test: sanity-hsm/70 should detect python 37/48737/3
Minh Diep [Mon, 3 Oct 2022 18:22:47 +0000 (11:22 -0700)]
LU-16183 test: sanity-hsm/70 should detect python

Check for python2 and python3 explicitly, since the
generic python command does not exist in newer distros.
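
A minimal sketch of such a probe. The helper name and the probing order
are assumptions and may differ from what sanity-hsm actually does:

```shell
# Hypothetical helper: print the first of the given commands found on PATH.
first_available() {
	local candidate

	for candidate in "$@"; do
		if command -v "$candidate" >/dev/null 2>&1; then
			echo "$candidate"
			return 0
		fi
	done
	return 1
}

# Probe the versioned names explicitly; the order is an assumption.
if python=$(first_available python3 python2); then
	echo "using $python"
else
	echo "no python2 or python3 found"
fi
```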

Test-Parameters: env=SLOW=yes,ENABLE_QUOTA=yes \
clientdistro=sles15sp3 testlist=sanity-hsm
Test-Parameters: env=SLOW=yes,ENABLE_QUOTA=yes \
clientdistro=el7.9 testlist=sanity-hsm

Change-Id: I35bbe15fd298341870ad4f1ab5976e82ccc84667
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48737
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Charlie Olmstead <charlie@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-13175 tests: sanity/803 to sync MDTs for actual statfs 46/37346/9
Alex Zhuravlev [Tue, 28 Jan 2020 23:00:59 +0000 (02:00 +0300)]
LU-13175 tests: sanity/803 to sync MDTs for actual statfs

Sync the MDTs before checking statfs, as the number of dnodes is
updated at commit.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I037064419a4674fe8e269b68e41f97c0f3763332
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/37346
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-16139 statahead: avoid to block ptlrpcd interpret context 51/48451/5
Qian Yingjin [Wed, 7 Sep 2022 08:59:19 +0000 (04:59 -0400)]
LU-16139 statahead: avoid to block ptlrpcd interpret context

If a stat-ahead entry is a striped directory or a regular file
with a layout change, it will generate a new RPC and block the
ptlrpcd interpret context for a long time.
However, blocking in a ptlrpcd thread is dangerous, as it may
result in deadlock.

The following is the stack trace for the timeout of replay-dual
test_26:
task:ptlrpcd_00_01   state:I stack:    0 pid: 8026 ppid:     2
osc_extent_wait+0x44d/0x560 [osc]
osc_cache_wait_range+0x2b8/0x930 [osc]
osc_io_fsync_end+0x67/0x80 [osc]
cl_io_end+0x58/0x130 [obdclass]
lov_io_end_wrapper+0xcf/0xe0 [lov]
lov_io_fsync_end+0x6f/0x1c0 [lov]
cl_io_end+0x58/0x130 [obdclass]
cl_io_loop+0xa7/0x200 [obdclass]
cl_sync_file_range+0x2c9/0x340 [lustre]
vvp_prune+0x5d/0x1e0 [lustre]
cl_object_prune+0x58/0x130 [obdclass]
lov_layout_change.isra.47+0x1ba/0x640 [lov]
lov_conf_set+0x38d/0x4e0 [lov]
cl_conf_set+0x60/0x140 [obdclass]
cl_file_inode_init+0xc8/0x380 [lustre]
ll_update_inode+0x432/0x6e0 [lustre]
ll_iget+0x227/0x320 [lustre]
ll_prep_inode+0x344/0xb60 [lustre]
ll_statahead_interpret_common.isra.26+0x69/0x830 [lustre]
ll_statahead_interpret+0x2c8/0x5b0 [lustre]
mdc_intent_getattr_async_interpret+0x14a/0x3e0 [mdc]
ptlrpc_check_set+0x5b8/0x1fe0 [ptlrpc]
ptlrpcd+0x6c6/0xa50 [ptlrpc]

In this patch, we use a work queue to handle the extra RPC and the
long wait in a separate thread for a striped directory and a regular
file with a layout change:
(@ll_prep_inode->@lmv_revalidate_slaves);
(@ll_prep_inode->@lov_layout_change->osc_cache_wait_range)

Test-Parameters: testlist=replay-dual env=ONLY=26,ONLY_REPEAT=10 mdscount=2 mdtcount=4
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I404a320620c4ec4caa608e675ecf324fcd26f1e0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48451
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-16149 lnet: Discovery queue and deletion race 32/48532/3
Chris Horn [Mon, 12 Sep 2022 21:09:38 +0000 (15:09 -0600)]
LU-16149 lnet: Discovery queue and deletion race

lnet_peer_deletion() can race with another thread calling
lnet_peer_queue_for_discovery.

Discovery thread:
 - Calls lnet_peer_deletion():
 - LNET_PEER_DISCOVERING bit is cleared from lnet_peer::lp_state
 - releases lnet_peer::lp_lock

Another thread:
 - Acquires lnet_net_lock/EX
 - Calls lnet_peer_queue_for_discovery()
 - Takes lnet_peer::lp_lock
 - Sets LNET_PEER_DISCOVERING bit
 - Releases lnet_peer::lp_lock
 - Sees lnet_peer::lp_dc_list is not empty, so it does not add peer
   to dc request queue
 - lnet_peer_queue_for_discovery() returns, lnet_net_lock/EX releases

Discovery thread:
 - Acquires lnet_net_lock/EX
 - Deletes peer from ln_dc_working list
 - performs the peer deletion

At this point, the peer is not on any discovery list, and it has
LNET_PEER_DISCOVERING bit set. This peer is now stranded, and any
messages on the peer's lnet_peer::lp_dc_pendq are likewise stranded.

To solve this, we modify lnet_peer_deletion() so that it waits to
clear the LNET_PEER_DISCOVERING bit until it has completed deleting
the peer and re-acquired the lnet_peer::lp_lock. This ensures we
cannot race with any other thread that may add the
LNET_PEER_DISCOVERING bit back to the peer. We also avoid deleting
the peer from the ln_dc_working list in lnet_peer_deletion(). This is
already done by lnet_peer_discovery_complete().

There is another window where the LNET_PEER_DISCOVERING bit can be
added when the discovery thread drops the lp_lock just before
acquiring the net_lock/EX and calling lnet_peer_discovery_complete().
Have lnet_peer_discovery_complete() clear LNET_PEER_DISCOVERING to
deal with this (it already does this for the case where discovery hit
an error). Also move the deletion of lp_dc_list to after we clear the
DISCOVERING bit. This is to mirror the behavior of
lnet_peer_queue_for_discovery() which sets the DISCOVERING bit and
then manipulates the lp_dc_list.

Also tweak the logic in lnet_peer_deletion() to call
lnet_peer_del_locked() in order to avoid extra calls to
lnet_net_lock()/lnet_net_unlock().

HPE-bug-id: LUS-11237
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifcfef1d49f216af4ddfcdaf928024e8ee3952555
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48532
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-15847 tgt: move tti_ transaction params to tsi_ 91/47491/5
Mikhail Pershin [Sat, 28 May 2022 18:16:11 +0000 (21:16 +0300)]
LU-15847 tgt: move tti_ transaction params to tsi_

Move tti_mult_trans and tti_has_trans to tgt_session_info so that
they are available in all targets. This allows cleaning up old
duplicated MDT code and can be used for complex transaction
handling in MDT/OFD if needed.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I3f0c15e283b9e21c04a009f6cf346afa278e7095
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47491
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John Hammond <jhammond@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-15847 tgt: reply always with the latest assigned transno 92/47492/3
Mikhail Pershin [Tue, 31 May 2022 10:38:25 +0000 (13:38 +0300)]
LU-15847 tgt: reply always with the latest assigned transno

In tgt_txn_stop_cb(), don't skip transno assignment in case
of unexpected multiple last_rcvd updates, so that the latest
transno is reported back in the reply rather than the first
one.

Reporting just the first transno might lead to data loss at
failover, because a partially committed operation will be
considered fully committed and the rest of the operation will
not be replayed.

The proposed approach of reporting the last assigned transno to
the client could cause replay failures in some cases, which
is still better than possible data loss. So the patch makes the
multiple transaction case less severe.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia07e89576127a2fc1eb2ae706551ffe8ceaa93be
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47492
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months agoLU-16025 llite: adjust read count as file got truncated 96/47896/21
Bobi Jam [Thu, 7 Jul 2022 07:38:54 +0000 (15:38 +0800)]
LU-16025 llite: adjust read count as file got truncated

A file read does not notice when the file is truncated by another
node, and continues to read zero-filled pages beyond the new file size.

This patch adds a limit on the read to prevent the issue and
adds a test case verifying the fix.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie51ba09201a1ca1464c3a3892d367590e978ee34
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47896
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
18 months agoLU-16046 ldlm: group lock fix 38/48038/6
Vitaly Fertman [Wed, 8 Jun 2022 20:05:45 +0000 (23:05 +0300)]
LU-16046 ldlm: group lock fix

The original LU-9964 fix had a problem: with many pages in
memory, grouplock unlock takes 10+ seconds just to discard them.

The current patch makes grouplock unlock asynchronous. It introduces
logic similar to the original, but at the mdc/osc layer.

Add a new test similar to sanity_244b but for DOM layout files.

HPE-bug-id: LUS-10644, LUS-10906
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: Ib6d6a3a41baff5b0161468abfd959f52e2a1b497
Reviewed-on: https://es-gerrit.dev.cray.com/159856
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48038
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16046 revert: "LU-9964 llite: prevent mulitple group locks" 37/48037/5
Vitaly Fertman [Thu, 9 Jun 2022 22:00:50 +0000 (01:00 +0300)]
LU-16046 revert: "LU-9964 llite: prevent mulitple group locks"

This reverts commit aba68250a67a10104c534bd726f67b31a7f35692
since it makes group unlock synchronous, which leads to poor performance
on shared file IO under group lock.

Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I4548986297c22e402acd051dbdf97fe58198d100
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48037
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-15721 llite: only statfs for projid if PROJINHERIT set 52/47352/3
Andreas Dilger [Sat, 14 May 2022 14:10:20 +0000 (08:10 -0600)]
LU-15721 llite: only statfs for projid if PROJINHERIT set

If projid is set on a directory but PROJINHERIT is not, do not report
the project quota for statfs.  This matches how ext4_statfs() and
xfs_fs_statfs() behave, on which Lustre project quota is modelled.

Fixes: e5c8f6670f ("LU-9555 quota: df should return projid-specific values")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I27cb444c3dfabc0ec693cee6fe6f9cae6db8a77a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47352
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
18 months agoLU-16219 tests: syntax error fix 95/48795/3
Elena Gryaznova [Thu, 6 Oct 2022 08:07:39 +0000 (11:07 +0300)]
LU-16219 tests: syntax error fix

scrub-performance:scrub_create() fix

Fixes: a20b78a81d ("LU-15357 iokit: fix the obsolete usage of cfg_device")
Test-Parameters: trivial testlist=scrub-performance
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-11277
Change-Id: Ib6e4354c2f399019ec2d6c33f9a7d544226c0392
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48795
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
18 months agoLU-16198 tests: increase margin for sanity/33hh 13/48713/3
Andreas Dilger [Fri, 30 Sep 2022 22:25:52 +0000 (16:25 -0600)]
LU-16198 tests: increase margin for sanity/33hh

The filenames created by sanity test_33hh are randomly generated by
"mktemp" and in some rare cases a larger number of filenames may
fail the CRUSH2 hash detection for 'random' suffixes (all-numeric,
all-uppercase, all-lowercase).  This appears to be failing about
1/200 tests, but since sanity is run frequently (~1400 times/month)
there are still occasional failures reported.

Increase the maximum filename mismatch rate from 20% to 23%, which
would have avoided all of the test failures in the past 3 months.
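
The effect of the new margin can be illustrated with integer
arithmetic. The helper below is a hypothetical sketch, not the check
used by test_33hh, and the sample counts are invented.

```shell
# Hypothetical check: succeed if mismatches/total is at most max_pct percent.
within_margin() {
	local mismatches=$1 total=$2 max_pct=$3

	[ $((mismatches * 100)) -le $((total * max_pct)) ]
}

# 22 mismatches out of 100 exceed a 20% limit but fit within 23%.
within_margin 22 100 20 || echo "22/100 fails at 20%"
within_margin 22 100 23 && echo "22/100 passes at 23%"

# At roughly 1 failure per 200 runs and ~1400 runs per month:
echo "expected failures per month: $((1400 / 200))"
```

Comparing the scaled integers avoids floating-point math, which plain
shell arithmetic does not provide.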

Test-Parameters: trivial testlist=sanity mdscount=2 mdtcount=4 env=ONLY=33hh,ONLY_REPEAT=400
Fixes: 1ac4b9598a ("LU-15720 dne: add crush2 hash type")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If63528c4281e543975454d1d84306b0dfcfc0fff
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48713
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16200 tests: test_32[f,g]: specify blocksize explicitly 08/48708/2
Elena Gryaznova [Fri, 30 Sep 2022 15:39:13 +0000 (18:39 +0300)]
LU-16200 tests: test_32[f,g]: specify blocksize explicitly

Fix conf-sanity:test_32f(), conf-sanity:test_32g() to be
independent from BLOCKSIZE environment variable.

To reproduce the failure, just run:
   BLOCKSIZE=4096 ONLY=32g sh conf-sanity.sh
  -total 36
  -total 64
  +total 16
  +total 9
   144115205289279502 -rw-r--r-- 1 0 0  1160 1550597702 README
   conf-sanity test_32g: @@@@@@ FAIL: list verification failed
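
The dependency can be demonstrated outside the test suite. This is a
sketch assuming GNU coreutils ls, where the BLOCKSIZE environment
variable changes the units of the sizes printed by 'ls -s' and an
explicit --block-size overrides it. The helper name and the 1024-byte
unit are illustrative choices, not necessarily what the patch uses.

```shell
# Hypothetical helper: list directory "$2" with sizes in fixed 1 KiB
# units while BLOCKSIZE is set to "$1" in the environment of ls.
list_with_blocksize() {
	env BLOCKSIZE="$1" ls -s --block-size=1024 "$2"
}

dir=$(mktemp -d)
printf 'hello\n' > "$dir/README"

# With the unit pinned on the command line, both listings should match.
if [ "$(list_with_blocksize 4096 "$dir")" = "$(list_with_blocksize 512 "$dir")" ]; then
	echo "listing is independent of BLOCKSIZE"
fi

rm -rf "$dir"
```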

Fixes: 3c1c462399 ("LU-1943 tests: Refresh conf-sanity 32[ab]")
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-11013
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Iaa07df8f5a9ba286ef5b3a5581b667cc7de63334
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48708
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months agoLU-16180 ptlrpc: reduce lock contention in ptlrpc_free_committed 29/48629/11
Andreas Dilger [Thu, 6 Oct 2022 17:31:51 +0000 (10:31 -0700)]
LU-16180 ptlrpc: reduce lock contention in ptlrpc_free_committed

This patch breaks out of the loop in ptlrpc_free_committed()
if need_resched() is true or there are other threads waiting
on the imp_lock. This avoids the thread holding the CPU for
too long while freeing a large number of requests. The
remaining requests in the list will be processed the next
time this function is called. That also avoids delaying a
single thread too long if the list is long.

Test-Parameters: testlist=sanity clientdistro=el8.6
Test-Parameters: testlist=sanity clientdistro=ubuntu2204 env=SANITY_EXCEPT="130 244a"

Change-Id: I50f56b87844e8b019053e569767b6c949d2a3f55
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48629
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16076 utils: enhance 'lfs check' command 55/48155/13
Lei Feng [Mon, 8 Aug 2022 02:59:25 +0000 (10:59 +0800)]
LU-16076 utils: enhance 'lfs check' command

Add an optional argument to the 'lfs check' command so that only the
servers related to the specified Lustre file system are checked.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanityn env=ONLY=113
Change-Id: I826a8e822af0a290f06ffaadadf1bb7f86899d99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48155
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16044 osd: discard pagecache in truncate's declaration 33/48033/14
Alex Zhuravlev [Mon, 25 Jul 2022 13:26:40 +0000 (16:26 +0300)]
LU-16044 osd: discard pagecache in truncate's declaration

Discard the pagecache in truncate's declaration to avoid taking the
pagelock inside a transaction, which conflicts with the write path
where we take the pagelock before any other lock. This should be
safe as the write path writes the pages out synchronously, so they
should be clean by the time of truncate.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Iba555ace2ce9ef34ab5517375ecb5c176f738a02
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-15451 sec: retry ro mount if read-only flag set 90/47490/6
Sebastien Buisson [Wed, 25 May 2022 14:53:57 +0000 (16:53 +0200)]
LU-15451 sec: retry ro mount if read-only flag set

In case client mount fails with -EROFS because the read-only nodemap
flag is set and ro mount option is not specified, just retry ro mount
internally. This is to avoid the need for users to manually retry the
mount with ro option.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0dedd1394eeb6804f7fdde930275f6649b935bab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-13364 utils: fix bad output for lnetctl import --show 22/43922/2
Cyril Bordage [Fri, 4 Jun 2021 03:40:07 +0000 (05:40 +0200)]
LU-13364 utils: fix bad output for lnetctl import --show

Read the right node from the yaml input ("net type" instead of "net")
to compare to what we find from ioctl when we filter results.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I9fbbac882f26fd93299f37cca00fcbd4cb7e95d2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43922
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-14165 utils: llog_reader: display changelog_user records 18/40818/4
Etienne AUJAMES [Tue, 1 Dec 2020 18:10:41 +0000 (19:10 +0100)]
LU-14165 utils: llog_reader: display changelog_user records

Add a function to print changelog_user information.

llog_reader output:

01 (080)changelog user record (v2) id:0x0 cur_id:3 cur_endrec:0
cur_time:1661258371 cur_mask:0x00000003 cur_name:"toto"
...
04 (080)changelog user record (v1) id:0x0 cur_id:6 cur_endrec:0
cur_time:1661261064

Test-Parameters: trivial testlist=sanity,sanity-hsm
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I4e948f52a678127d70e8084e94fb89ec2677cc4b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40818
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-6142 obdclass: change some foo0() to __foo() 03/48803/2
Mr. NeilBrown [Fri, 7 Oct 2022 12:57:29 +0000 (08:57 -0400)]
LU-6142 obdclass: change some foo0() to __foo()

Change:
  cl_io_init0 -> __cl_io_init
  cl_lock_trace0 -> __cl_lock_trace
  cl_page_delete0 -> __cl_page_delete
  cl_page_state_set0 -> __cl_page_state_set
  cl_page_own0 -> __cl_page_own
  cl_page_disown0 -> __cl_page_disown

This is more consistent with Linux naming style.

Test-Parameters: trivial
Change-Id: If38b52465d42ac425d47c1e9ded62bd7f013e0eb
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48803
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-10391 lnet: support IPv6 in lnet_inet_enumerate() 72/48572/2
Mr NeilBrown [Fri, 16 Sep 2022 00:57:13 +0000 (10:57 +1000)]
LU-10391 lnet: support IPv6 in lnet_inet_enumerate()

lnet_inet_enumerate() can now optionally report IPv6 addresses on
interfaces.  We use this in socklnd to determine the address of the
interface.

Unlike IPv4, different IPv6 addresses associated with a single
interface cannot be associated with different labels (e.g. eth0:2).
This means that lnet_inet_enumerate() must report the same name for
each address.  For now, we only report the first non-temporary address
to avoid any confusion.

The network mask provided with IPv4 is only used for reporting
information for an ioctl.  It isn't clear this will be useful for
IPv6, so no netmask is collected.

To save a bit of space in struct lnet_inetdev{}, which must now hold a
16-byte address, we replace the 4-byte flag with a 1-byte bool as only the
IFF_MASTER flag is ever of interest.  Another bool is needed to report
whether the address is IPv6.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7a73033f40cc83a8993281696f17332a9101db1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48572
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16002 ptlrpc: reduce pinger eviction time 28/47928/10
Alexander Boyko [Fri, 16 Sep 2022 08:00:38 +0000 (04:00 -0400)]
LU-16002 ptlrpc: reduce pinger eviction time

On the server side, eviction is based on PING_INTERVAL. A client
should be evicted after PING_EVICT_TIMEOUT, but the eviction logic
adds an additional 3 PING_INTERVALs on top of that. For a
configuration with obd_timeout equal to 300, the addition is 225
seconds. The second-level timeout is needed when the network is down
for some time, and it prevents client evictions after the first
connection.
This patch adds logic to check if an import is active, and evicts the
client faster without the second level. This reduces the eviction
timeout to PING_EVICT_TIMEOUT.
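
The arithmetic behind the 225-second figure can be worked through in a
short Python sketch. The relation PING_INTERVAL = obd_timeout / 4 is
inferred from the numbers above (300 gives 3 * 75 = 225); the multiplier
of 6 used for PING_EVICT_TIMEOUT is an assumption of this sketch and
should be checked against the Lustre headers.

```python
def ping_interval(obd_timeout):
    # Inferred from the commit text: 3 intervals == 225 s when obd_timeout == 300.
    return max(obd_timeout // 4, 1)


def eviction_times(obd_timeout, evict_multiplier=6):
    """Return eviction timing, in seconds, before and after the change.

    `evict_multiplier` is an assumed value for PING_EVICT_TIMEOUT expressed
    in ping intervals; it is not stated in the commit message.
    """
    interval = ping_interval(obd_timeout)
    evict_timeout = interval * evict_multiplier
    second_level = 3 * interval          # the extra delay described above
    return {
        "ping_interval": interval,
        "second_level": second_level,
        "before": evict_timeout + second_level,
        "after": evict_timeout,
    }


times = eviction_times(300)
```

Whatever the exact multiplier is, the saving for obd_timeout=300 is the
225 seconds of second-level delay.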

replay_dual test_0a is based on a client eviction during recovery,
so the lfs df check could fail because of the eviction. Complete the
check in a way similar to recovery-small.sh.

Test-Parameters: testlist=recovery-small env=RECOVERY_SMALL_EXCEPT=144 serverversion=2.14
HPE-bug-id: LUS-11054
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I4d60046ef4737f9cf95a16ac0ab63a36859b8adc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16211 o2iblnd: Avoid NULL md deref 77/48777/3
Chris Horn [Mon, 3 Oct 2022 21:34:11 +0000 (15:34 -0600)]
LU-16211 o2iblnd: Avoid NULL md deref

struct lnet_msg::msg_md is NULL when a router is forwarding a
REPLY. ko2iblnd attempts to access this pointer on the receive path.
This causes a panic.

Test-Parameters: trivial
Fixes: 959304eac7 ("LU-15189 lnet: fix memory mapping.")
HPE-bug-id: LUS-11269
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0c1dbb1e0bcd3c17b278f358755d465f7bbbb2b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16199 ldiskfs: make ubuntu kernel version detection better 17/48717/2
Ake Sandgren [Mon, 3 Oct 2022 06:39:20 +0000 (08:39 +0200)]
LU-16199 ldiskfs: make ubuntu kernel version detection better

Ubuntu kernel version detection is not working correctly with
the official versioning scheme.  There are also a couple of errors in the
AS_VERSION_COMPARE sequences causing problems for 5.4.0 and later.

Signed-off-by: Ake Sandgren <ake.sandgren@hpc2n.umu.se>
Change-Id: Ie6e51de95ae1513b15ee0c2baa8c421f3cb954f5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48717
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16197 kfilnd: Convert NID num to host order 00/48700/2
Chris Horn [Mon, 26 Sep 2022 18:59:38 +0000 (12:59 -0600)]
LU-16197 kfilnd: Convert NID num to host order

The nid_num field in struct lnet_nid is stored in network byte order.
The nid_num field is used to generate the kfabric service string. The
underlying kfabric providers expect the service string to be in host
byte order not network byte order. This mismatch is preventing
multiple LNet NID indexes from being used.

Fix this by converting nid_num to host byte order.
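
The effect of the missing conversion can be shown with a small Python
sketch. It is only an illustration: the 16-bit field width and the
little-endian host are assumptions made for the example, and the helper
names are hypothetical.

```python
import struct


def to_network_order(value):
    # Store a 16-bit number the way the commit describes: network (big-endian) order.
    return struct.pack("!H", value)


def read_without_conversion(raw):
    # What a little-endian host sees if it uses the stored bytes as-is.
    return struct.unpack("<H", raw)[0]


def read_with_conversion(raw):
    # Equivalent of converting from network to host order before use.
    return struct.unpack("!H", raw)[0]


raw = to_network_order(1)
wrong = read_without_conversion(raw)
right = read_with_conversion(raw)
```

Without the conversion, index 1 reads back as 256 on such a host, so a
service string built from it would not match what the provider expects.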

Test-Parameters: trivial
HPE-bug-id: LUS-11254
Change-Id: I804daa6d66d775212a83e3ed013310b383b94974
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48700
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16191 socklnd: limit retries on conns_per_peer mismatch 64/48664/3
Serguei Smirnov [Mon, 26 Sep 2022 23:47:24 +0000 (16:47 -0700)]
LU-16191 socklnd: limit retries on conns_per_peer mismatch

If connection initiator has a higher conns-per-peer setting than
its peer, don't try to create extra connections forever as the
peer will keep rejecting them. A few retries should suffice to
resolve a valid race.
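
A bounded-retry loop of the kind described above might look like the
following Python sketch. It is not the socklnd code: the Peer class, the
retry limit of 3, and all function names are hypothetical.

```python
MAX_CONN_RETRIES = 3  # hypothetical limit; the real value is not given here


class Peer:
    """Toy peer that accepts at most `conns_per_peer` connections."""

    def __init__(self, conns_per_peer):
        self.limit = conns_per_peer
        self.conns = 0

    def accept(self):
        if self.conns >= self.limit:
            return False  # reject the extra connection
        self.conns += 1
        return True


def create_connections(wanted, try_connect, max_retries=MAX_CONN_RETRIES):
    """Try to open `wanted` connections, giving up after `max_retries` rejections."""
    established = 0
    rejections = 0
    while established < wanted and rejections < max_retries:
        if try_connect():
            established += 1
        else:
            rejections += 1
    return established, rejections


peer = Peer(conns_per_peer=2)
established, rejections = create_connections(4, peer.accept)
```

The initiator wants 4 connections, the peer allows 2, and the loop ends
after a few rejections instead of retrying forever.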

Test-Parameters: trivial
Fixes: 71b2476e ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I7d04d4ac41e98a738b6c85c3d323608038f5c51e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48664
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15791 tests: Drop local traffic during health test 61/48661/2
Chris Horn [Mon, 26 Sep 2022 15:19:19 +0000 (09:19 -0600)]
LU-15791 tests: Drop local traffic during health test

Existing drop rules for health tests omit local nids for the
destination so it is possible for local NI health values to recover
while the tests execute. Add drop rules for local NIDs to prevent
their health from recovering.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=205,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6a4a06b3fa76effd21e21449abf47cd0e14bbf18
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48661
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16051 o2iblnd: detect link state to set fatal error on ni 44/48644/3
Serguei Smirnov [Fri, 23 Sep 2022 22:20:51 +0000 (15:20 -0700)]
LU-16051 o2iblnd: detect link state to set fatal error on ni

To avoid selecting an lnet ni which corresponds to a downed link
for sending, add a mechanism for detecting ip-layer link events
in o2iblnd. On ip link up/down events, find the corresponding
ni and toggle the ni_fatal_error_on flag. This complements the
existing mechanism for ib-layer link event handling.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I4720cd0a7bc577a522c7d40b54f821a4c12b670f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48644
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16184 o2iblnd: fix deadline for tx on peer queue 40/48640/2
Serguei Smirnov [Fri, 23 Sep 2022 19:29:59 +0000 (12:29 -0700)]
LU-16184 o2iblnd: fix deadline for tx on peer queue

In o2iblnd, deadline is checked for txs on peer queue,
but not set prior to adding the tx to the queue. This
may cause the tx to be dropped unnecessarily with
"Timed out tx for ..." warning.

Fix it by setting the tx_deadline when adding tx to peer queue.
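
The problem and the fix can be illustrated with a short Python sketch.
It assumes an unset deadline is 0 and that a tx counts as timed out once
the current time reaches its deadline; the class and function names are
hypothetical and are not taken from o2iblnd.

```python
class Tx:
    def __init__(self, name):
        self.name = name
        self.deadline = 0  # assumed "unset" value for this sketch


def queue_tx(queue, tx, now, timeout, set_deadline=True):
    """Add a tx to the peer queue, optionally setting its deadline first."""
    if set_deadline:
        tx.deadline = now + timeout
    queue.append(tx)


def timed_out(queue, now):
    """Return the names of queued txs whose deadline has passed."""
    return [tx.name for tx in queue if now >= tx.deadline]


buggy_queue, fixed_queue = [], []
queue_tx(buggy_queue, Tx("old"), now=100, timeout=50, set_deadline=False)
queue_tx(fixed_queue, Tx("new"), now=100, timeout=50)

dropped_early = timed_out(buggy_queue, now=101)
still_pending = timed_out(fixed_queue, now=101)
expired_later = timed_out(fixed_queue, now=150)
```

With the deadline left unset, the tx looks expired one tick after it was
queued; with the deadline set at queueing time it only expires once the
timeout has actually elapsed.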

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ie7cf5590b440b60f71527049953a64bb31d53578
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48640
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15595 tests: Router test interop check and aarch fix 78/48578/9
Chris Horn [Wed, 14 Sep 2022 01:23:37 +0000 (20:23 -0500)]
LU-15595 tests: Router test interop check and aarch fix

setup_router_test() executes load_lnet() on remote nodes, but
this function was only added in 2.15. Add a version check for it.

Enabling routing may fail on nodes with a small amount of memory (like
the aarch config). Define a small number of router buffers to work around
this issue. Modify the functions which calculate the number of buffers
to allow small sizes to be specified via parameters.

Test-Parameters: trivial testlist=sanity-lnet serverversion=2.12.9
Test-Parameters: testgroup=review-ldiskfs-arm testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If0b76747fe09e883546f18da9f3322c72263e29d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-13641 socklnd: remove remnants of tcp bonding 68/48568/3
Mr NeilBrown [Thu, 15 Sep 2022 05:32:05 +0000 (15:32 +1000)]
LU-13641 socklnd: remove remnants of tcp bonding

->ksnp_n_passive_ips is now always zero, so remove it and all uses of
it.  ->ksnp_passive_ips is gone too, as is ksocknal_ip2iface().

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I5de6d027c545087c961673d8704f68c4f3dd5076
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48568
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16150 zfs: Fix ZFS(2.1.99-1) build error on CentOS (3.10) 36/48536/5
Arshad Hussain [Tue, 13 Sep 2022 07:31:25 +0000 (03:31 -0400)]
LU-16150 zfs: Fix ZFS(2.1.99-1) build error on CentOS (3.10)

ZFS: (2.1.99-1)
Lustre: 27723374a38 LU-16073 utils: double snapshot_mount fix
CentOS: 3.10.0-1160.15.2.el7.x86_64

This patch fixes build failures seen as below for the
above configuration:

First:
make[4]: Entering directory `/root/lustre01/lustre-release/lustre/utils'
gcc  -rdynamic -shared -export-dynamic -pthread \
-L/root/zfs/zfs_git_lustre_build/zfs//lib/libzfs/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs//lib/libnvpair/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs//lib/libzpool/.libs/ -o
mount_osd_zfs.so \
`ar -t libmount_utils_zfs.a` \
-ldl   -lzfs -lnvpair -lzpool
/usr/bin/ld: cannot find -lzfs
/usr/bin/ld: cannot find -lnvpair
/usr/bin/ld: cannot find -lzpool
collect2: error: ld returned 1 exit status

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I32f270c7912379f7dce940e0aa2bceee5e49ad79
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-15885 o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE 92/48492/2
Serguei Smirnov [Thu, 8 Sep 2022 22:27:12 +0000 (15:27 -0700)]
LU-15885 o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE

RDMA_CM_EVENT_UNREACHABLE may be received not only when connection
is being connected, but also when it is being closed. Fix handling
of this event accordingly.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I79428188c159b2d80d36326589b2977db065d4a7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48492
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15646 llog: correct llog FID and path output 30/48430/6
Mikhail Pershin [Sat, 3 Sep 2022 07:31:38 +0000 (10:31 +0300)]
LU-15646 llog: correct llog FID and path output

- fix wrong LLOG_ID-to-FID conversion to output llog FID by
  introducing PLOGID macro to expand llog ID for DFID format
- stop printing lgl_ogen along with llog FID as it is always zero
  since 2.3.51 and is not used anymore
- output correct path for update llog in llog_reader
- always print header info in llog_reader if available
- print llog flags in header info

Fixes: 5a8e47d0a1a7 ("LU-9153 llog: update llog print format to use FIDs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7ba49e8101a67d2d80c204a5fc629bfd0bce89ad
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48430
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15738 test: check lfsck status before starting 18/48018/3
Hongchao Zhang [Fri, 22 Jul 2022 15:02:24 +0000 (23:02 +0800)]
LU-15738 test: check lfsck status before starting

If LFSCK is already running when "lfsck_start" is called
to start it, the test shouldn't fail on starting LFSCK.

Test-Parameters: trivial testlist=sanity-lfsck
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I266d9e2b9c5f37eb9e08b489fab428268b90d895
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48018
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-15472 ldlm: optimize flock reprocess 57/46257/7
Andriy Skulysh [Fri, 5 Nov 2021 10:55:08 +0000 (12:55 +0200)]
LU-15472 ldlm: optimize flock reprocess

Resource reprocess on flock unlock can be done once
after all pending unlock requests.
This reduces spinlock contention.

Change-Id: I2809070f27fe3af7e1fc34e2b4b22603931f3dff
HPE-bug-id: LUS-10471, LUS-10909
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46257
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-10391 lnet: use %pISc for formatting IP addresses 85/48685/2
Mr NeilBrown [Wed, 28 Sep 2022 04:41:47 +0000 (14:41 +1000)]
LU-10391 lnet: use %pISc for formatting IP addresses

The Linux kernel's printf functionality understands %pIS to mean that
the address in a 'struct sockaddr' should be formatted, either as
IPv4 or IPv6.  For IPv6, the verbose format showing all 16 bytes
whether zero or not is used.

To get the more familiar "compressed" format where strings of :0000:
are replaced with ::, we need to add the 'c' flag.  This is ignored
for IPv4.

When requesting the port as well ("%pISp"), the 'c' and 'p' can appear
in either order.

So this patch changes all %pIS to %pISc as we always want the
compressed format.
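
The difference between the verbose and compressed IPv6 forms can be seen
with Python's standard ipaddress module. This only illustrates the two
textual forms; it does not exercise the kernel's printf, and the helper
names are hypothetical.

```python
import ipaddress


def verbose_form(addr):
    # All 16 bytes shown, zero groups included.
    return ipaddress.ip_address(addr).exploded


def compressed_form(addr):
    # Longest run of zero groups collapsed to "::".
    return ipaddress.ip_address(addr).compressed


v6_verbose = verbose_form("fe80::1")
v6_compressed = compressed_form("fe80:0000:0000:0000:0000:0000:0000:0001")
v4_verbose = verbose_form("10.0.0.1")
v4_compressed = compressed_form("10.0.0.1")
```

As with the 'c' flag, compression only changes the IPv6 output; the IPv4
address reads the same either way.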

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ida17f5008e06a00c5460cf7161ed07de8fa7a65d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48685
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16138 kernel: preserve RHEL8.x server kABI for block integrity 08/48608/2
Jian Yu [Tue, 20 Sep 2022 18:19:12 +0000 (11:19 -0700)]
LU-16138 kernel: preserve RHEL8.x server kABI for block integrity

Currently there are two kernel patches supporting SCSI T10-PI feature
left in the RHEL8.x series:

- block-integrity-allow-optional-integrity-functions-rhel8.patch
- block-pass-bio-into-integrity_processing_fn-rhel8.patch

The changes in the patches modified "struct bio_integrity_payload"
and "struct blk_integrity_iter", which caused kABI breakage.

This patch fixes the patches to preserve kABI by using
RH-supplied compatibility macros.

Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.5 serverdistro=el8.5
Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.6 serverdistro=el8.6

Change-Id: If547e1cd4ae4ff1affd315bbfefaeeff4f1dea81
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48608
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-9680 obdclass: use netlink to collect devices information 18/31618/80
James Simmons [Sat, 17 Sep 2022 20:19:48 +0000 (16:19 -0400)]
LU-9680 obdclass: use netlink to collect devices information

Our utilities can report to users a device list with various bits
of data using the debugfs file 'devices'. By default this debugfs
file is available only to root, which prevents regular users
from collecting information. Enable non-root users to collect
the same information for lctl dl using netlink. The advantage of
using netlink is that it also removes the 8K ioctl limit. Add the
ability to present this data in YAML format as well.

Change-Id: I5e6378765bd2f4c415cf29b2bc54adf0e54f308b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/31618
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16166 ptlrpc: lower the message level in no resend case 85/48585/2
Yang Sheng [Mon, 19 Sep 2022 05:46:27 +0000 (13:46 +0800)]
LU-16166 ptlrpc: lower the message level in no resend case

Don't report the wrong generation as an error message in
rq_no_resend case.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I534cadc916fcd1eb6840439b6507e646d0e5d974
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48585
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15943 tests: Modify timing of sanity-lnet 210 and 211 80/48580/4
Chris Horn [Wed, 14 Sep 2022 00:47:58 +0000 (19:47 -0500)]
LU-15943 tests: Modify timing of sanity-lnet 210 and 211

The portions of test_210 and test_211 that test the
max_recovery_ping_interval parameter are a little racy because the
window where we can get an accurate ping count is small. This is due
to the tests only being able to sleep for whole seconds vs the more
fine-grained time keeping done in the kernel.

Increase the max interval from 2 to 4 and adjust the expected
ping counts accordingly.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=210,ONLY_REPEAT=100
Test-Parameters: testlist=sanity-lnet env=ONLY=211,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Idf8b2ff0d5745bdf4484e75f452bc4f06fbcf1a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
18 months ago LU-16161 kernel: kernel update RHEL8.6 [4.18.0-372.26.1.el8_6] 64/48564/2
Jian Yu [Thu, 15 Sep 2022 18:43:02 +0000 (11:43 -0700)]
LU-16161 kernel: kernel update RHEL8.6 [4.18.0-372.26.1.el8_6]

Update RHEL8.6 kernel to 4.18.0-372.26.1.el8_6.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Change-Id: I45bf6dbff5061407e1109732b6d466d0f7a8376c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48564
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16144 nrs: implement force mode for nrs_tbf_req_get() 94/48494/5
Etienne AUJAMES [Fri, 9 Sep 2022 06:52:02 +0000 (08:52 +0200)]
LU-16144 nrs: implement force mode for nrs_tbf_req_get()

ptlrpc_service_purge_all() calls ptlrpc_server_request_get() with
"force=true" to purge all active requests before stopping an NRS
policy (when unregistering a service).

"force" mode should always return a request if a pending request is
present in the NRS policy.

nrs_tbf_req_get() does not implement such a mode and can return a
NULL pointer.
This can cause a crash when umounting a target if a TBF rule rate
threshold is reached:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000114
IP: [<ffffffffc0d9e965>] ptlrpc_nrs_req_stop_nolock+0x5/0x150
.....
? ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc]
ptlrpc_service_purge_all+0x137/0x920 [ptlrpc]
ptlrpc_unregister_service+0xe7/0x6f0 [ptlrpc]
ost_cleanup+0x52/0x1b0 [ost]
class_free_dev+0x21d/0x720 [obdclass]
class_export_put+0x1f0/0x2c0 [obdclass]
class_unlink_export+0x135/0x170 [obdclass]
class_decref+0x80/0x160 [obdclass]
class_detach+0x1b3/0x2e0 [obdclass]
class_process_config+0x1a38/0x2830 [obdclass]
? complete+0x4a/0x60
? list_del+0xd/0x30
? wait_for_completion+0x4e/0x140
class_manual_cleanup+0x1e0/0x710 [obdclass]
server_stop_servers+0xd5/0x160 [obdclass]
server_put_super+0x12d/0xd00 [obdclass]
generic_shutdown_super+0x6d/0x100
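
The missing "force" behaviour can be sketched with a toy token-bucket
queue in Python. This is not the NRS TBF implementation: the class, the
single shared bucket, and the purge helper are hypothetical and only show
why a purge loop needs req_get() to ignore the rate limit.

```python
from collections import deque


class TokenBucketQueue:
    """Toy rate-limited queue: one token is consumed per dequeued request."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pending = deque()

    def enqueue(self, req):
        self.pending.append(req)

    def req_get(self, force=False):
        if not self.pending:
            return None
        if self.tokens <= 0 and not force:
            return None  # rate threshold reached, request stays queued
        if self.tokens > 0:
            self.tokens -= 1
        return self.pending.popleft()


def purge_all(queue):
    """Drain every pending request, as a service shutdown would."""
    drained = []
    while True:
        req = queue.req_get(force=True)
        if req is None:
            break
        drained.append(req)
    return drained


queue = TokenBucketQueue(tokens=1)
for name in ("a", "b", "c"):
    queue.enqueue(name)

first = queue.req_get()       # consumes the only token
throttled = queue.req_get()   # no tokens left, nothing returned
drained = purge_all(queue)    # force mode ignores the rate limit
```

In the sketch, a caller that assumes a pending request is always returned
would trip over the None from the throttled call; force mode removes that
case for the purge path.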

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: Ic4443700725d9308764fbf21cb7de6fa4ab41134
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48494
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16072 utils: snapshot support to foreign host 26/48226/8
Akash B [Tue, 24 May 2022 05:49:41 +0000 (01:49 -0400)]
LU-16072 utils: snapshot support to foreign host

Currently the <foreign> host field in /etc/ldev.conf is unused/ignored.
Because of this, <lctl snapshot_*> commands do not work when the <local>
host is not accessible or if any of the targets have failed over to
the <foreign> host. This patch addresses those cases so that
<lctl snapshot_{create, destroy, mount, umount, list, modify}>
commands work when the targets are present on the <foreign> host.

HPE-bug-id: LUS-10648
Test-Parameters: fstype=zfs testlist=sanity-lsnapshot
Signed-off-by: Akash B <akash-b@hpe.com>
Change-Id: I706c5e43755386eab4facd42ff7a127aa5c9254c
Reviewed-on: https://es-gerrit.dev.cray.com/160702
Tested-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Tested-by: Siddarth Raj <siddarth.raj@hpe.com>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48226
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16059 build: Installation of dkms server builds 83/48083/7
Shaun Tancheff [Wed, 24 Aug 2022 14:22:58 +0000 (21:22 +0700)]
LU-16059 build: Installation of dkms server builds

The linux-zfs-dkms package is passing the wrong paths
for zfs [and spl] causing the dkms build to fail.

ZFS_VERSION is not parsed correctly from 'dkms status'.

The splver and zfsver check can match against the wrong
package(s).

lustre-zfs-dkms provides: kmod-lustre-osd-zfs, and
                          lustre-osd-zfs-mount
lustre-ldiskfs-dkms provides: kmod-lustre-osd-ldiskfs and
                              lustre-osd-ldiskfs-mount

In the case of multiple zfs versions installed, build lustre
osd against the highest version number.
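
Picking the highest of several installed versions needs a numeric
comparison rather than a plain string comparison. The Python sketch below
illustrates the idea only; the actual change lives in the packaging
scripts, and the version strings and helper names here are made up.

```python
import re


def version_key(version):
    # Compare versions by their numeric components, e.g. "2.1.11" -> (2, 1, 11).
    return tuple(int(part) for part in re.findall(r"\d+", version))


def highest_version(versions):
    return max(versions, key=version_key)


installed = ["2.1.5", "2.1.11", "0.8.6"]   # hypothetical 'dkms status' results
chosen = highest_version(installed)
naive = max(installed)                     # plain string comparison
```

The string comparison picks "2.1.5" because "5" sorts after "1", while
the numeric key selects "2.1.11".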

HPE-bug-id: LUS-11113
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic154ca045427bf26cb7e6a44b8c467675e987aad
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48083
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-16125 tests: make sanity-sec more robust with SSK 86/48386/4
Sebastien Buisson [Tue, 30 Aug 2022 09:22:34 +0000 (11:22 +0200)]
LU-16125 tests: make sanity-sec more robust with SSK

Encryption-related tests in sanity-sec unmount and remount clients
in order to exercise code with and without the encryption key.
If SSK is in use, we need to make sure flavors are properly
applied before carrying on.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I92e85dc6dcef43f70a7fe05db94cd18fe66a3a24
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48386
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15777 hsm: set changelog error for restore layout swap failure 21/47121/14
Nikitas Angelinas [Wed, 11 May 2022 22:54:08 +0000 (15:54 -0700)]
LU-15777 hsm: set changelog error for restore layout swap failure

Set the error code in the generated changelog record if the layout swap
fails at the end of an HSM restore operation. Also, handle error code
overflow inside hsm_set_cl_error(), so that callers don't need to do
this themselves.
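
The idea of handling the overflow in one place can be sketched as
below. This is a Python illustration, not the Lustre C code: the
function set_cl_error, the 7-bit field width, and the constant names
and values are all assumptions chosen for the example. An error code
that does not fit in the field is replaced by a reserved overflow value
inside the setter, so no caller has to clamp it first.

```python
# Assumed layout for illustration: the error code occupies the low
# 7 bits of the record flags, and the all-ones value is reserved to
# mean "error code did not fit".
ERR_BITS = 7
ERR_MASK = (1 << ERR_BITS) - 1   # 0x7F
MAX_ERROR = ERR_MASK - 1         # largest code stored verbatim (0x7E)
ERR_OVERFLOW = ERR_MASK          # reserved overflow marker (0x7F)


def set_cl_error(flags, error):
    """Return `flags` with the error field replaced by `error`.

    Hypothetical helper: negative errno-style values are stored as their
    magnitude, and anything above MAX_ERROR becomes ERR_OVERFLOW.
    """
    code = abs(error)
    if code > MAX_ERROR:
        code = ERR_OVERFLOW
    # Clear the old error bits, keep the other flag bits, store the code.
    return (flags & ~ERR_MASK) | code


if __name__ == "__main__":
    print(hex(set_cl_error(0x80, -5)))
    print(hex(set_cl_error(0x80, 200)))
```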

Suggested-by: Olaf Weber <olaf.weber@hpe.com>
Suggested-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Change-Id: I4ed2ebffa3bc1c6a0f87ea9f13734e344f77006f
HPE-bug-id: LUS-10863
Test-Parameters: testlist=sanity-hsm,sanity-pcc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47121
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15626 tests: Fix "error" reported by shellcheck for functions.sh 34/46834/2
Arshad Hussain [Wed, 16 Mar 2022 08:04:10 +0000 (13:34 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck for functions.sh

This patch fixes the "error"-level issues reported by shellcheck
for functions.sh. It also converts space indentation to tabs.

Test-Parameters: trivial
Test-Parameters: testlist=sanity,sanityn
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iec24ca81b16994c3bfbdc38d8106576a315e0bbd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46834
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
18 months ago LU-15619 osc: Remove oap_magic 13/46713/5
Patrick Farrell [Wed, 2 Mar 2022 00:14:03 +0000 (19:14 -0500)]
LU-15619 osc: Remove oap_magic

oap_magic exists only to debug init and allocation
failures, but it is allocated for every page of memory. That
wastes a lot of memory on something that doesn't need
dedicated debug support.

Remove it.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I360e09676f7ba8c3e5296bdf75a6e7f75e91eadb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46713
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>