Whamcloud - gitweb
fs/lustre-release.git
Alex Zhuravlev [Mon, 25 Jul 2022 13:26:40 +0000 (16:26 +0300)]
LU-16044 osd: discard pagecache in truncate's declaration

Discard the pagecache in truncate's declaration to avoid taking the
pagelock inside a transaction, which conflicts with the write path
where we take the pagelock before any other lock.
This should be safe as the write path writes the pages out
synchronously, so they should be clean by truncate.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Iba555ace2ce9ef34ab5517375ecb5c176f738a02
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Wed, 25 May 2022 14:53:57 +0000 (16:53 +0200)]
LU-15451 sec: retry ro mount if read-only flag set

In case client mount fails with -EROFS because the read-only nodemap
flag is set and ro mount option is not specified, just retry ro mount
internally. This is to avoid the need for users to manually retry the
mount with ro option.
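
The retry described above can be sketched roughly as follows. This is only an illustration of the control flow, not the Lustre client code: mount_fn, mount_with_ro_retry() and fake_readonly_server() are hypothetical names introduced for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mount callback: returns 0 on success or a negative errno. */
typedef int (*mount_fn)(bool read_only);

/*
 * Try the mount as requested. If it fails with -EROFS and the caller did
 * not ask for "ro", retry once read-only. *used_ro reports the mode of the
 * last attempt.
 */
static int mount_with_ro_retry(mount_fn do_mount, bool want_ro, bool *used_ro)
{
	int rc = do_mount(want_ro);

	*used_ro = want_ro;
	if (rc == -EROFS && !want_ro) {
		*used_ro = true;
		rc = do_mount(true);
	}
	return rc;
}

/* Stand-in for a server whose nodemap has the read-only flag set. */
static int fake_readonly_server(bool read_only)
{
	return read_only ? 0 : -EROFS;
}
```

Calling mount_with_ro_retry(fake_readonly_server, false, &used_ro) succeeds on the second attempt and leaves used_ro set, which mirrors the user no longer having to re-run the mount with the ro option by hand.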

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0dedd1394eeb6804f7fdde930275f6649b935bab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Cyril Bordage [Fri, 4 Jun 2021 03:40:07 +0000 (05:40 +0200)]
LU-13364 utils: fix bad output for lnetctl import --show

Read the right node from the yaml input ("net type" instead of "net")
to compare to what we find from ioctl when we filter results.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I9fbbac882f26fd93299f37cca00fcbd4cb7e95d2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43922
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Etienne AUJAMES [Tue, 1 Dec 2020 18:10:41 +0000 (19:10 +0100)]
LU-14165 utils: llog_reader: display changelog_user records

Add a function to print changelog_user information.

llog_reader output:

01 (080)changelog user record (v2) id:0x0 cur_id:3 cur_endrec:0
cur_time:1661258371 cur_mask:0x00000003 cur_name:"toto"
...
04 (080)changelog user record (v1) id:0x0 cur_id:6 cur_endrec:0
cur_time:1661261064

Test-Parameters: trivial testlist=sanity,sanity-hsm
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I4e948f52a678127d70e8084e94fb89ec2677cc4b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40818
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mr. NeilBrown [Fri, 7 Oct 2022 12:57:29 +0000 (08:57 -0400)]
LU-6142 obdclass: change some foo0() to __foo()

Change:
  cl_io_init0 -> __cl_io_init
  cl_lock_trace0 -> __cl_lock_trace
  cl_page_delete0 -> __cl_page_delete
  cl_page_state_set0 -> __cl_page_state_set
  cl_page_own0 -> __cl_page_own
  cl_page_disown0 -> __cl_page_disown

This is more consistent with Linux naming style.

Test-Parameters: trivial
Change-Id: If38b52465d42ac425d47c1e9ded62bd7f013e0eb
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48803
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Fri, 16 Sep 2022 00:57:13 +0000 (10:57 +1000)]
LU-10391 lnet: support IPv6 in lnet_inet_enumerate()

lnet_inet_enumerate() can now optionally report IPv6 addresses on
interfaces.  We use this in socklnd to determine the address of the
interface.

Unlike IPv4, different IPv6 addresses associated with a single
interface cannot be associated with different labels (e.g. eth0:2).
This means that lnet_inet_enumerate() must report the same name for
each address.  For now, we only report the first non-temporary address
to avoid any confusion.

The network mask provided with IPv4 is only used for reporting
information for an ioctl.  It isn't clear this will be useful for
IPv6, so no netmask is collected.

To save a bit of space in struct lnet_inetdev{}, which must now hold a
16-byte address, we replace the 4-byte flag with a 1-byte bool, as only
the IFF_MASTER flag is ever of interest.  Another bool is needed to
report whether the address is IPv6.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7a73033f40cc83a8993281696f17332a9101db1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48572
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Fri, 16 Sep 2022 08:00:38 +0000 (04:00 -0400)]
LU-16002 ptlrpc: reduce pinger eviction time

On the server side, eviction is based on PING_INTERVAL. A client
should be evicted after PING_EVICT_TIMEOUT, but the eviction logic
adds a further 3 * PING_INTERVAL to it. For a configuration
with obd_timeout equal to 300, the addition is 225 seconds.
This second-level timeout is needed when the network is down for
some time, and it prevents client evictions after the first
connection.
The patch adds logic to check whether an import is active
and to evict the client faster, without the second level. This
reduces the eviction timeout to PING_EVICT_TIMEOUT.
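
The arithmetic above can be checked with a small sketch. It assumes PING_INTERVAL is obd_timeout / 4, which is inferred from the numbers in this message (300 / 4 = 75 and 3 * 75 = 225) rather than quoted from the source tree. The function names are hypothetical, and the eviction timeout is passed in as a parameter rather than assumed.

```c
#include <stdbool.h>

/* Assumed: the ping interval is a quarter of obd_timeout, at least 1s. */
static unsigned int ping_interval(unsigned int obd_timeout)
{
	unsigned int interval = obd_timeout / 4;

	return interval > 0 ? interval : 1;
}

/* Extra delay added by the second-level timeout: 3 ping intervals. */
static unsigned int second_level_extra(unsigned int obd_timeout)
{
	return 3 * ping_interval(obd_timeout);
}

/*
 * Total time before eviction, given the base eviction timeout and
 * whether the second level still applies.
 */
static unsigned int eviction_delay(unsigned int evict_timeout,
				   unsigned int obd_timeout,
				   bool second_level)
{
	return evict_timeout +
	       (second_level ? second_level_extra(obd_timeout) : 0);
}
```

With obd_timeout set to 300, second_level_extra() returns 225, matching the figure quoted above. Skipping the second level removes exactly that amount from the total delay.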

replay_dual test_0a relies on a client eviction during recovery, so
the lfs df check could fail because of the eviction. Complete the check
in the same way as recovery-small.sh.

Test-Parameters: testlist=recovery-small env=RECOVERY_SMALL_EXCEPT=144 serverversion=2.14
HPE-bug-id: LUS-11054
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I4d60046ef4737f9cf95a16ac0ab63a36859b8adc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Mon, 3 Oct 2022 21:34:11 +0000 (15:34 -0600)]
LU-16211 o2iblnd: Avoid NULL md deref

struct lnet_msg::msg_md is NULL when a router is forwarding a
REPLY. ko2iblnd attempts to access this pointer on the receive path.
This causes a panic.
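
The kind of guard this calls for might look like the sketch below. The struct layouts are simplified stand-ins (only the msg_md field name comes from the message above), and msg_md_options() is a hypothetical helper, not the actual ko2iblnd change.

```c
#include <stddef.h>

/* Simplified stand-ins for the real LNet structures. */
struct md_sketch {
	unsigned int md_options;
};

struct msg_sketch {
	struct md_sketch *msg_md;	/* NULL when a router forwards a REPLY */
};

/*
 * Read the MD options only when an MD is attached; otherwise return the
 * caller-supplied fallback instead of dereferencing NULL.
 */
static unsigned int msg_md_options(const struct msg_sketch *msg,
				   unsigned int fallback)
{
	if (msg->msg_md == NULL)
		return fallback;
	return msg->msg_md->md_options;
}
```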

Test-Parameters: trivial
Fixes: 959304eac7 ("LU-15189 lnet: fix memory mapping.")
HPE-bug-id: LUS-11269
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0c1dbb1e0bcd3c17b278f358755d465f7bbbb2b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Ake Sandgren [Mon, 3 Oct 2022 06:39:20 +0000 (08:39 +0200)]
LU-16199 ldiskfs: make ubuntu kernel version detection better

Ubuntu kernel version detection is not working correctly with
official versioning scheme.  There are also a couple of errors in the
AS_VERSION_COMPARE sequences causing problems for 5.4.0 and later.

Signed-off-by: Ake Sandgren <ake.sandgren@hpc2n.umu.se>
Change-Id: Ie6e51de95ae1513b15ee0c2baa8c421f3cb954f5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48717
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Mon, 26 Sep 2022 18:59:38 +0000 (12:59 -0600)]
LU-16197 kfilnd: Convert NID num to host order

The nid_num field in struct lnet_nid is stored in network byte order.
The nid_num field is used to generate the kfabric service string. The
underlying kfabric providers expect the service string to be in host
byte order not network byte order. This mismatch is preventing
multiple LNet NID indexes from being used.

Fix this by converting nid_num to host byte order.
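
A userspace sketch of the conversion follows. The real service string format is not given in this message, so a plain decimal rendering is assumed here, and service_string() is a hypothetical helper.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build a service string from a NID number stored in network byte order.
 * Converting with ntohl() first means the string carries the host-order
 * value on both little- and big-endian machines.
 */
static int service_string(char *buf, size_t len, uint32_t nid_num_be)
{
	return snprintf(buf, len, "%u", (unsigned int)ntohl(nid_num_be));
}
```

Without the ntohl() call, a NID number of 2 would be rendered as 33554432 on a little-endian host, which shows how the byte-order mismatch described above produces the wrong service string.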

Test-Parameters: trivial
HPE-bug-id: LUS-11254
Change-Id: I804daa6d66d775212a83e3ed013310b383b94974
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48700
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Mon, 26 Sep 2022 23:47:24 +0000 (16:47 -0700)]
LU-16191 socklnd: limit retries on conns_per_peer mismatch

If connection initiator has a higher conns-per-peer setting than
its peer, don't try to create extra connections forever as the
peer will keep rejecting them. A few retries should suffice to
resolve a valid race.
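
A bounded retry of this kind can be sketched as below. The structure, the helper names and the limit of 3 are all assumptions made for illustration; the message only says that "a few" retries should suffice.

```c
#include <stdbool.h>

#define MAX_CONN_RETRIES 3	/* hypothetical limit */

/* Hypothetical per-peer bookkeeping. */
struct conn_state {
	int wanted;		/* local conns_per_peer setting */
	int established;	/* connections the peer has accepted */
	int rejects;		/* extra connection attempts the peer refused */
};

/* Keep trying only while connections are missing and retries remain. */
static bool should_retry_extra_conn(const struct conn_state *s)
{
	return s->established < s->wanted && s->rejects < MAX_CONN_RETRIES;
}

/* Record that the peer rejected one more extra connection. */
static void note_reject(struct conn_state *s)
{
	s->rejects++;
}
```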

Test-Parameters: trivial
Fixes: 71b2476e ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I7d04d4ac41e98a738b6c85c3d323608038f5c51e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48664
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Mon, 26 Sep 2022 15:19:19 +0000 (09:19 -0600)]
LU-15791 tests: Drop local traffic during health test

Existing drop rules for health tests omit local nids for the
destination so it is possible for local NI health values to recover
while the tests execute. Add drop rules for local NIDs to prevent
their health from recovering.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=205,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6a4a06b3fa76effd21e21449abf47cd0e14bbf18
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48661
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Fri, 23 Sep 2022 22:20:51 +0000 (15:20 -0700)]
LU-16051 o2iblnd: detect link state to set fatal error on ni

To avoid selecting lnet ni which corresponds to a downed link
for sending, add a mechanism for detecting ip-layer link events
in o2iblnd. On ip link up/down events, find corresponding
ni and toggle ni_fatal_error_on flag. This complements the
existing mechanism for ib-layer link event handling.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I4720cd0a7bc577a522c7d40b54f821a4c12b670f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48644
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Serguei Smirnov [Fri, 23 Sep 2022 19:29:59 +0000 (12:29 -0700)]
LU-16184 o2iblnd: fix deadline for tx on peer queue

In o2iblnd, deadline is checked for txs on peer queue,
but not set prior to adding the tx to the queue. This
may cause the tx to be dropped unnecessarily with
"Timed out tx for ..." warning.

Fix it by setting the tx_deadline when adding tx to peer queue.
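
The effect of the missing assignment can be seen in a small sketch. The types and helpers here are hypothetical simplifications, with plain integers standing in for kernel time values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified transmit descriptor. */
struct tx_sketch {
	long deadline;			/* absolute time; 0 if never set */
	struct tx_sketch *next;
};

/* Queue a tx on the peer list, stamping its deadline first. */
static void queue_tx(struct tx_sketch **head, struct tx_sketch *tx,
		     long now, long timeout)
{
	tx->deadline = now + timeout;
	tx->next = *head;
	*head = tx;
}

/* The periodic check that drops txs whose deadline has passed. */
static bool tx_timed_out(const struct tx_sketch *tx, long now)
{
	return now >= tx->deadline;
}
```

A tx whose deadline was never set (still 0) looks expired to tx_timed_out() at any positive time. This is the unnecessary drop described above.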

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ie7cf5590b440b60f71527049953a64bb31d53578
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48640
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Wed, 14 Sep 2022 01:23:37 +0000 (20:23 -0500)]
LU-15595 tests: Router test interop check and aarch fix

setup_router_test() executes load_lnet() on remote nodes, but
this function was only added in 2.15. Add a version check for it.

Enabling routing may fail on nodes with a small amount of memory (like
the aarch config). Define a small number of router buffers to work around
this issue. Modify the functions which calculate the number of buffers
to allow small sizes to be specified via parameters.

Test-Parameters: trivial testlist=sanity-lnet serverversion=2.12.9
Test-Parameters: testgroup=review-ldiskfs-arm testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If0b76747fe09e883546f18da9f3322c72263e29d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Thu, 15 Sep 2022 05:32:05 +0000 (15:32 +1000)]
LU-13641 socklnd: remove remnants of tcp bonding

->ksnp_n_passive_ips is now always zero, so remove it and all uses of
it.  ->ksnp_passive_ips is gone too, as is ksocknal_ip2iface().

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I5de6d027c545087c961673d8704f68c4f3dd5076
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48568
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Arshad Hussain [Tue, 13 Sep 2022 07:31:25 +0000 (03:31 -0400)]
LU-16150 zfs: Fix ZFS(2.1.99-1) build error on CentOS (3.10)

ZFS: (2.1.99-1)
Lustre: 27723374a38 LU-16073 utils: double snapshot_mount fix
CentOS: 3.10.0-1160.15.2.el7.x86_64

This patch fixes the build failures seen below for the
above configuration:

First:
make[4]: Entering directory `/root/lustre01/lustre-release/lustre/utils'
gcc  -rdynamic -shared -export-dynamic -pthread \
-L/root/zfs/zfs_git_lustre_build/zfs//lib/libzfs/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs//lib/libnvpair/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs//lib/libzpool/.libs/ -o
mount_osd_zfs.so \
`ar -t libmount_utils_zfs.a` \
-ldl   -lzfs -lnvpair -lzpool
/usr/bin/ld: cannot find -lzfs
/usr/bin/ld: cannot find -lnvpair
/usr/bin/ld: cannot find -lzpool
collect2: error: ld returned 1 exit status

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I32f270c7912379f7dce940e0aa2bceee5e49ad79
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Serguei Smirnov [Thu, 8 Sep 2022 22:27:12 +0000 (15:27 -0700)]
LU-15885 o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE

RDMA_CM_EVENT_UNREACHABLE may be received not only when a connection
is being established, but also when it is being closed. Fix handling
of this event accordingly.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I79428188c159b2d80d36326589b2977db065d4a7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48492
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sat, 3 Sep 2022 07:31:38 +0000 (10:31 +0300)]
LU-15646 llog: correct llog FID and path output

- fix wrong LLOG_ID-to-FID conversion to output llog FID by
  introducing PLOGID macro to expand llog ID for DFID format
- stop printing lgl_ogen along with llog FID as it is always zero
  since 2.3.51 and is not used anymore
- output correct path for update llog in llog_reader
- always print header info in llog_reader if available
- print llog flags in header info

Fixes: 5a8e47d0a1a7 ("LU-9153 llog: update llog print format to use FIDs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7ba49e8101a67d2d80c204a5fc629bfd0bce89ad
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48430
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Hongchao Zhang [Fri, 22 Jul 2022 15:02:24 +0000 (23:02 +0800)]
LU-15738 test: check lfsck status before starting

If the LFSCK has been started before calling "lfsck_start"
to start it, the test shouldn't fail for starting LFSCK.

Test-Parameters: trivial testlist=sanity-lfsck
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I266d9e2b9c5f37eb9e08b489fab428268b90d895
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48018
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andriy Skulysh [Fri, 5 Nov 2021 10:55:08 +0000 (12:55 +0200)]
LU-15472 ldlm: optimize flock reprocess

Resource reprocessing on flock unlock can be done once,
after all pending unlock requests.
This reduces spinlock contention.

Change-Id: I2809070f27fe3af7e1fc34e2b4b22603931f3dff
HPE-bug-id: LUS-10471, LUS-10909
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46257
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Wed, 28 Sep 2022 04:41:47 +0000 (14:41 +1000)]
LU-10391 lnet: use %pISc for formatting IP addresses

The Linux kernel's printf functionality understands %pIS to mean that
the address in a 'struct sockaddr' should be formatted, either as
IPv4 or IPv6.  For IPv6, the verbose format showing all 16 bytes,
whether zero or not, is used.

To get the more familiar "compressed" format where strings of :0000:
are replaced with ::, we need to add the 'c' flag.  This is ignored
for IPv4.

When requesting the port as well ("%pISp"), the 'c' and 'p' can appear
in either order.

So this patch changes all %pIS to %pISc as we always want the
compressed format.
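
The difference between the two formats can be reproduced in userspace. The sketch below is an analogy only: %pIS is a kernel printk extension and is not used here. format_verbose() and format_compressed() are hypothetical helpers, with the standard inet_ntop() producing the compressed form.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Verbose form: all 16 bytes as eight 4-digit groups, zeros included. */
static void format_verbose(const struct in6_addr *addr, char *buf, size_t len)
{
	int i;

	buf[0] = '\0';
	for (i = 0; i < 16; i += 2) {
		size_t off = strlen(buf);

		snprintf(buf + off, len - off, "%s%02x%02x",
			 i ? ":" : "", addr->s6_addr[i], addr->s6_addr[i + 1]);
	}
}

/* Compressed form: inet_ntop() collapses the longest run of zeros to "::". */
static const char *format_compressed(const struct in6_addr *addr,
				     char *buf, socklen_t len)
{
	return inet_ntop(AF_INET6, addr, buf, len);
}
```

For an address such as 2001:db8::1, the first helper yields eight full groups while the second yields the short form most administrators expect to see in logs.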

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ida17f5008e06a00c5460cf7161ed07de8fa7a65d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48685
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 20 Sep 2022 18:19:12 +0000 (11:19 -0700)]
LU-16138 kernel: preserve RHEL8.x server kABI for block integrity

Currently there are two kernel patches supporting SCSI T10-PI feature
left in the RHEL8.x series:

- block-integrity-allow-optional-integrity-functions-rhel8.patch
- block-pass-bio-into-integrity_processing_fn-rhel8.patch

The changes in the patches modified "struct bio_integrity_payload"
and "struct blk_integrity_iter", which caused kABI breakage.

This patch fixes the patches to preserve kABI by using
RH-supplied compatibility macros.

Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.5 serverdistro=el8.5
Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.6 serverdistro=el8.6

Change-Id: If547e1cd4ae4ff1affd315bbfefaeeff4f1dea81
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48608
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Simmons [Sat, 17 Sep 2022 20:19:48 +0000 (16:19 -0400)]
LU-9680 obdclass: use netlink to collect devices information

Our utilities can report to users a device list with various bits
of data using the debugfs file 'devices'. By default this debugfs file
is available only to root, which prevents regular users
from collecting the information. Enable non-root users to collect
the same information for lctl dl using netlink. The advantage of
using netlink is that it also removes the 8K ioctl limit. Add the
ability to present this data in YAML format as well.

Change-Id: I5e6378765bd2f4c415cf29b2bc54adf0e54f308b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/31618
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Yang Sheng [Mon, 19 Sep 2022 05:46:27 +0000 (13:46 +0800)]
LU-16166 ptlrpc: lower the message level in no resend case

Don't report the wrong generation as an error message in the
rq_no_resend case.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I534cadc916fcd1eb6840439b6507e646d0e5d974
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48585
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Wed, 14 Sep 2022 00:47:58 +0000 (19:47 -0500)]
LU-15943 tests: Modify timing of sanity-lnet 210 and 211

The portions of test_210 and test_211 that test the
max_recovery_ping_interval parameter are a little racy because the
window where we can get an accurate ping count is small. This is due
to the tests only being able to sleep for whole seconds vs the more
fine-grained time keeping done in the kernel.

Increase the max interval from 2 to 4 and adjust the expected
ping counts accordingly.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=210,ONLY_REPEAT=100
Test-Parameters: testlist=sanity-lnet env=ONLY=211,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Idf8b2ff0d5745bdf4484e75f452bc4f06fbcf1a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Thu, 15 Sep 2022 18:43:02 +0000 (11:43 -0700)]
LU-16161 kernel: kernel update RHEL8.6 [4.18.0-372.26.1.el8_6]

Update RHEL8.6 kernel to 4.18.0-372.26.1.el8_6.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Change-Id: I45bf6dbff5061407e1109732b6d466d0f7a8376c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48564
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Etienne AUJAMES [Fri, 9 Sep 2022 06:52:02 +0000 (08:52 +0200)]
LU-16144 nrs: implement force mode for nrs_tbf_req_get()

ptlrpc_service_purge_all() calls ptlrpc_server_request_get() with
"force=true" to purge all active requests before stopping an NRS
policy (when unregistering a service).

"force" mode should always return a request if a pending request is
present in the NRS policy.

nrs_tbf_req_get() does not implement such a mode and can return a
NULL pointer.
This can cause a crash when umounting a target if a TBF rule rate
threshold is reached:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000114
IP: [<ffffffffc0d9e965>] ptlrpc_nrs_req_stop_nolock+0x5/0x150
.....
? ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc]
ptlrpc_service_purge_all+0x137/0x920 [ptlrpc]
ptlrpc_unregister_service+0xe7/0x6f0 [ptlrpc]
ost_cleanup+0x52/0x1b0 [ost]
class_free_dev+0x21d/0x720 [obdclass]
class_export_put+0x1f0/0x2c0 [obdclass]
class_unlink_export+0x135/0x170 [obdclass]
class_decref+0x80/0x160 [obdclass]
class_detach+0x1b3/0x2e0 [obdclass]
class_process_config+0x1a38/0x2830 [obdclass]
? complete+0x4a/0x60
? list_del+0xd/0x30
? wait_for_completion+0x4e/0x140
class_manual_cleanup+0x1e0/0x710 [obdclass]
server_stop_servers+0xd5/0x160 [obdclass]
server_put_super+0x12d/0xd00 [obdclass]
generic_shutdown_super+0x6d/0x100
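
The force-mode contract described above can be illustrated with a toy token-bucket queue. This is a sketch under simplifying assumptions, not the nrs_tbf_req_get() implementation: requests are reduced to a counter and the names are hypothetical.

```c
#include <stdbool.h>

/* Toy model: a count of queued requests and of available tokens. */
struct tbf_queue {
	int pending;
	int tokens;
};

/*
 * Dequeue one request. In normal mode an empty bucket throttles the
 * caller; in force mode a pending request is always handed out, so a
 * purge loop never sees "nothing" while requests remain queued.
 */
static bool tbf_req_get(struct tbf_queue *q, bool force)
{
	if (q->pending == 0)
		return false;
	if (q->tokens == 0 && !force)
		return false;	/* rate limit reached */
	if (q->tokens > 0)
		q->tokens--;
	q->pending--;
	return true;
}
```

With two pending requests and no tokens, a normal get returns nothing, while forced gets drain both requests. This is the behaviour a purge before policy shutdown relies on.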

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: Ic4443700725d9308764fbf21cb7de6fa4ab41134
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48494
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Akash B [Tue, 24 May 2022 05:49:41 +0000 (01:49 -0400)]
LU-16072 utils: snapshot support to foreign host

Currently the <foreign> host field in /etc/ldev.conf is unused/ignored,
so <lctl snapshot_*> commands do not work when the <local>
host is not accessible or when any of the targets have failed over to
the <foreign> host. This patch addresses those cases so that
<lctl snapshot_{create, destroy, mount, umount, list, modify}>
commands work when the targets are present on the <foreign> host.

HPE-bug-id: LUS-10648
Test-Parameters: fstype=zfs testlist=sanity-lsnapshot
Signed-off-by: Akash B <akash-b@hpe.com>
Change-Id: I706c5e43755386eab4facd42ff7a127aa5c9254c
Reviewed-on: https://es-gerrit.dev.cray.com/160702
Tested-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Tested-by: Siddarth Raj <siddarth.raj@hpe.com>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48226
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Shaun Tancheff [Wed, 24 Aug 2022 14:22:58 +0000 (21:22 +0700)]
LU-16059 build: Installation of dkms server builds

The linux-zfs-dkms package is passing the wrong paths
for zfs [and spl] causing the dkms build to fail.

ZFS_VERSION is not parsed correctly from 'dkms status'.

The splver and zfsver check can match against the wrong
package(s).

lustre-zfs-dkms provides: kmod-lustre-osd-zfs, and
                          lustre-osd-zfs-mount
lustre-ldiskfs-dkms provides: kmod-lustre-osd-ldiskfs and
                              lustre-osd-ldiskfs-mount

In the case of multiple zfs versions installed, build lustre
osd against the highest version number.

HPE-bug-id: LUS-11113
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic154ca045427bf26cb7e6a44b8c467675e987aad
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48083
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 30 Aug 2022 09:22:34 +0000 (11:22 +0200)]
LU-16125 tests: make sanity-sec more robust with SSK

Encryption related tests in sanity-sec carry out unmount and mount of
clients in order to exercise code with and without the encryption key.
In case SSK is in use, we need to make sure flavors are properly
applied before carrying on.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I92e85dc6dcef43f70a7fe05db94cd18fe66a3a24
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48386
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Nikitas Angelinas [Wed, 11 May 2022 22:54:08 +0000 (15:54 -0700)]
LU-15777 hsm: set changelog error for restore layout swap failure

Set the error code in the changelog record generated, if the layout swap
fails at the end of an HSM restore operation. Also, handle error code
overflow inside hsm_set_cl_error(), so that callers don't need to do
this themselves.

Suggested-by: Olaf Weber <olaf.weber@hpe.com>
Suggested-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Change-Id: I4ed2ebffa3bc1c6a0f87ea9f13734e344f77006f
HPE-bug-id: LUS-10863
Test-Parameters: testlist=sanity-hsm,sanity-pcc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47121
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-15626 tests: Fix "error" reported by shellcheck for functions.sh 34/46834/2
Arshad Hussain [Wed, 16 Mar 2022 08:04:10 +0000 (13:34 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck for functions.sh

This patch fixes the "error"-level issues reported by shellcheck
for functions.sh. It also converts spaces to tabs.

Test-Parameters: trivial
Test-Parameters: testlist=sanity,sanityn
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iec24ca81b16994c3bfbdc38d8106576a315e0bbd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46834
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-15619 osc: Remove oap_magic 13/46713/5
Patrick Farrell [Wed, 2 Mar 2022 00:14:03 +0000 (19:14 -0500)]
LU-15619 osc: Remove oap_magic

oap_magic exists only to debug init and allocation
failures, but it is allocated for every page of memory, which
wastes a lot of memory on something that does not need
dedicated debugging.

Remove it.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I360e09676f7ba8c3e5296bdf75a6e7f75e91eadb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46713
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-14108 mount: prevent if --network and discovery 32/46632/6
Cyril Bordage [Fri, 7 Jan 2022 10:08:21 +0000 (11:08 +0100)]
LU-14108 mount: prevent if --network and discovery

The --network= option to mkfs.lustre allows restricting a target
(OST/MDT) to a given LNet network. This makes it register to the MGS
with the specified network only. However, dynamic discovery is unaware
of this restriction and this can create problems.
We prevent mounting with mkfs "network" option if discovery is enabled
by returning an EINVAL error.
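
The check described above can be pictured with a small Python sketch. The function name and its two parameters are hypothetical and only model the decision; the real check lives in the kernel mount path.

```python
import errno

def check_mount_options(network_option, discovery_enabled):
    """Return 0 if the mount may proceed, or -EINVAL when a restricted
    network is combined with dynamic discovery (hypothetical helper)."""
    if network_option and discovery_enabled:
        return -errno.EINVAL
    return 0

# A target restricted to one network, on a node with discovery enabled.
print(check_mount_options("tcp1", True))
```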

Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I4b6da7804162192054d7b29a28fbe4cb015e6570
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46632
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10973 lnet: Various test cleanups 15/45915/5
Amir Shehata [Wed, 22 Dec 2021 05:42:32 +0000 (21:42 -0800)]
LU-10973 lnet: Various test cleanups

Cleaning up some of the LUTF test failures

Test-Parameters: @lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I529d3f171357255d04991293a5df4c7b41622d07
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10973 lnet: LUTF UDSP test suite and routing test suite 77/39777/45
Serguei Smirnov [Mon, 31 Aug 2020 22:35:52 +0000 (18:35 -0400)]
LU-10973 lnet: LUTF UDSP test suite and routing test suite

Added the UDSP suite and routing suite to the LUTF test cases.

Updated some of the infrastructure scripts with methods needed
for the new test cases.

Test-Parameters: @lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ibd74cea48982ccafc3b1d5034a409fd2df9e7b1c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10973 lnet: LUTF Multi-Rail test suite 58/39458/54
Amir Shehata [Mon, 20 Jul 2020 21:04:32 +0000 (14:04 -0700)]
LU-10973 lnet: LUTF Multi-Rail test suite

Added a test suite which covers various Multi-Rail functionality.

Test-Parameters: @lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0480e59ebd97c943669194acbb1c80222e202a6e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39458
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10973 lnet: LUTF dynamic discovery test suite 95/39195/58
Amir Shehata [Sat, 27 Jun 2020 04:15:04 +0000 (21:15 -0700)]
LU-10973 lnet: LUTF dynamic discovery test suite

Add the dynamic discovery test suite to the LUTF test cases.

Updated some of the infrastructure scripts with methods needed
for the DD test cases

Test-Parameters: @lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0cfef4ae6f88b4deca12f1a3d5ef3291137a6c04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10360 tests: test dynamic NIDs feature 11/39911/42
Amir Shehata [Tue, 15 Sep 2020 01:18:47 +0000 (18:18 -0700)]
LU-10360 tests: test dynamic NIDs feature

Add five LUTF test cases to test the following:
1. Enabling/Disabling dynamic_nids module parameter.
2. Allow clients to continue using servers which have changed
   their IP address during a boot cycle.
3. Verify feature is disabled if dynamic_nids module parameter
   is not set.
4. Verify feature is disabled if the dynamic_nids module parameter
   is asymmetrically set.

Test-Parameters: @lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I481c2ae938d07398f6b40af2a1a1db039168ede7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39911
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10973 lnet: LUTF DLC test suite and sample test suite 08/40108/40
Serguei Smirnov [Wed, 30 Sep 2020 22:52:39 +0000 (18:52 -0400)]
LU-10973 lnet: LUTF DLC test suite and sample test suite

Add the DLC test suite and sample test suite to LUTF test cases.

Test-Parameters: @lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ic7579023cfaf796fd40d6e12434137fb3ec5b0e4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40108
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-10391 lnet: only use PUBLIC IP6 addresses for connections 71/48571/3
Mr NeilBrown [Fri, 16 Sep 2022 00:49:51 +0000 (10:49 +1000)]
LU-10391 lnet: only use PUBLIC IP6 addresses for connections

IPv6 can have temporary addresses.  These can be used for short-lived
outgoing connections to increase privacy.  They are not suitable for
long-term connections.

So request that only PUBLIC IPv6 addresses are used when making a
connection.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I1414d9ea11cd5873438a4c088884cefd7d933c8c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48571
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-10391 socklnd: support IPv6 in ksocknal_ip2index() 70/48570/2
Mr NeilBrown [Thu, 15 Sep 2022 05:09:59 +0000 (15:09 +1000)]
LU-10391 socklnd: support IPv6 in ksocknal_ip2index()

ksocknal_ip2index() can now find the interface index for an IPv6
address.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idd6bee5c9db417b05f8208ab5ab309f4c8404d54
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48570
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-10391 lnet: add iface index to struct lnet_inetdev 69/48569/2
Mr NeilBrown [Thu, 15 Sep 2022 01:47:55 +0000 (11:47 +1000)]
LU-10391 lnet: add iface index to struct lnet_inetdev

When getting the list of interfaces, get the index as well.  This can be
useful and avoids searching the list of interfaces again to find it.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9b3b2516fd4ec1b83e2ec31e1318326ed22cb31b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48569
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-12511 utils: make kfilnd support a soft requirement 18/48518/5
James Simmons [Sat, 17 Sep 2022 15:45:12 +0000 (11:45 -0400)]
LU-12511 utils: make kfilnd support a soft requirement

The new kfilnd driver doesn't exist upstream and looks like it
will be missing upstream for some time. Make building the code
for this new LND optional, which is needed for the native Linux
Lustre client.

Test-Parameters: trivial
Change-Id: Ib17f78b12ffed95e4198d4524f5ca44aab01c010
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48518
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months agoLU-10391 lnet: track pinginfo size in bytes, not nis. 27/44627/17
Mr NeilBrown [Sun, 14 Aug 2022 21:37:23 +0000 (17:37 -0400)]
LU-10391 lnet: track pinginfo size in bytes, not nis.

When we extend the pinginfo to be able to store large-address nids,
there could be nids of different sizes in it.  So using the number of
nis to track the size won't work.  Change to using the number of
bytes instead, i.e. the total size of the 'struct lnet_ping_info'.
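
A small Python sketch of why an entry count stops being enough once entries can differ in size. The header size and per-entry sizes below are invented for illustration and are not the real structure layout.

```python
HEADER_BYTES = 16  # assumed fixed header size, for illustration only

def ping_info_size(entry_sizes):
    """Total buffer size in bytes for entries of possibly different sizes."""
    return HEADER_BYTES + sum(entry_sizes)

same_size = [16, 16, 16]   # three entries of one (assumed) size
mixed = [16, 16, 28]       # three entries, one of them larger

# Both buffers hold three entries, yet they need different amounts of
# memory, so only a byte count describes them unambiguously.
print(len(same_size), ping_info_size(same_size))
print(len(mixed), ping_info_size(mixed))
```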

This affects pb_nnis in the ping_buffer, and the global
ln_push_target_nnis.

LNET_PING_INFO_SIZE is removed as size won't depend on number of nids
any more.

When determining the number of bytes expected in a received ping_info,
use a new macro lnet_ping_info_size() which can extract information
as required from the ping_info.

Note that lnet_ping_target_create() now initializes pi_nis to 0.
Setting the initial size doesn't seem to be useful.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7727b784ed9a7510959d5ec41f8df3851adb78ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-16135 lod: prohibit DoM pattern in plain layout 33/48433/3
Mikhail Pershin [Mon, 5 Sep 2022 07:41:37 +0000 (10:41 +0300)]
LU-16135 lod: prohibit DoM pattern in plain layout

The DoM pattern can be set as the default plain layout of a directory
by older LFS versions, which miss the DoM component sanity checks when
a plain layout is used. Such a layout is not allowed and causes a
crash later when a file is created under that directory.

LFS can prevent this, but not in all Lustre versions,
so LOD should do the check as well.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic58fdda2ab3e63083128cb6cf949fcb43ccd2c02
Reviewed-on: https://review.whamcloud.com/48433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-16160 osc: take ldlm lock when queue sync pages 57/48557/2
Bobi Jam [Thu, 15 Sep 2022 06:46:34 +0000 (14:46 +0800)]
LU-16160 osc: take ldlm lock when queue sync pages

osc_queue_sync_pages() adds an osc_extent to the osc_object's IO extent
list without taking ldlm locks, and then it calls
osc_io_unplug_async() to queue the IO work for the client.

This patch makes sync page queuing take the ldlm lock in the
osc_extent.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idefa2981e62a2a6e10d8b8a7692c0337b61b9052
Reviewed-on: https://review.whamcloud.com/48557
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-16123 checkpatch: Suppress false warning 75/48375/2
Arshad Hussain [Mon, 29 Aug 2022 10:51:45 +0000 (16:21 +0530)]
LU-16123 checkpatch: Suppress false warning

checkpatch throws a warning if it finds an "UPPERCASE"
on the left-hand side. According to the script/code it
is to avoid cases like "foo + BAR < baz".

Warnings example:
(style)  Comparisons should place the constant on \
the right side of the test

However, in our case the following code triggers this
warning as a false positive:

#if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(3, 0, 53, 0)
...
#endif

This patch suppresses the warning thrown by the above
code only. It is not a generic suppressor for upper-case
names on the left-hand side, which can be a genuine error.
It only handles the case where the left side is the
LUSTRE_VERSION_CODE macro.
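
To make the narrow scope of the exemption concrete, here is a simplified Python sketch. checkpatch itself is a Perl script with a much more elaborate test; the regular expression and the `warns` helper below are stand-ins written for this example only.

```python
import re

# Simplified stand-in for the "constant on the left" check: an
# upper-case identifier followed by a comparison operator.
CONSTANT_ON_LEFT = re.compile(r"\b([A-Z][A-Z0-9_]*)\s*(?:==|!=|<=|>=|<|>)")

def warns(line):
    """Return True if the simplified check would warn about this line."""
    match = CONSTANT_ON_LEFT.search(line)
    if match is None:
        return False
    # Only this one macro is exempted; other upper-case names still warn.
    return match.group(1) != "LUSTRE_VERSION_CODE"

print(warns("#if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(3, 0, 53, 0)"))
print(warns("if (MAX_COUNT < count)"))
```

Exempting a single name rather than the whole pattern keeps the warning useful for genuine constant-on-the-left comparisons.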

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic8d8fccae035ba6e2ea28099bea6f163ceb0da0a
Reviewed-on: https://review.whamcloud.com/48375
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-16154 obdclass: free inst_name correctly 42/48542/4
Emoly Liu [Thu, 15 Sep 2022 01:42:47 +0000 (09:42 +0800)]
LU-16154 obdclass: free inst_name correctly

In function class_config_llog_handler(), inst_name should be freed
correctly before the break.

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I6adc0ed62c3c637237834b799f25666d0e7e1ecb
Reviewed-on: https://review.whamcloud.com/48542
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-16153 tests: add version check to conf-sanity.sh test_133 41/48541/2
Emoly Liu [Wed, 14 Sep 2022 02:38:03 +0000 (10:38 +0800)]
LU-16153 tests: add version check to conf-sanity.sh test_133

conf-sanity.sh test_133 from the patch at
https://review.whamcloud.com/38136 landed in 2.15.51.
To avoid interop failures, a version check is added there.
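
A version gate of this kind can be sketched in Python as follows. The test framework itself is shell; the byte-packed encoding and the helper names here are assumptions made for the example, and the sketch only handles plain dotted numbers.

```python
def version_code(version):
    """Pack 'major.minor.patch[.fix]' into one comparable integer
    (assumed byte-per-component layout)."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (4 - len(parts))
    major, minor, patch, fix = parts[:4]
    return (major << 24) | (minor << 16) | (patch << 8) | fix

def should_run(server_version, minimum="2.15.51"):
    """Run the test only against servers at or above the minimum version."""
    return version_code(server_version) >= version_code(minimum)

print(should_run("2.15.52"), should_run("2.12.9"))
```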

Test-Parameters: trivial
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Ic5c142faa6f61fe83ce86e67a7cee8d8b183cdaf
Reviewed-on: https://review.whamcloud.com/48541
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-14992 tests: sanity/replay-vbr mkdir on MDT0 02/44902/9
James Nunez [Mon, 13 Sep 2021 16:35:30 +0000 (10:35 -0600)]
LU-14992 tests: sanity/replay-vbr mkdir on MDT0

Replace mkdir with mkdir_on_mdt0() for sanity test 133a
and replay-vbr test 7a.  These tests expect the newly
created directory to be on MDT0.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: env=SLOW=yes mdscount=2 mdtcount=4 testlist=replay-vbr
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Icea2923a8d8d3a3aa0ddf0401f0a025480b2f6f0
Reviewed-on: https://review.whamcloud.com/44902
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Kevin Zhao <kevin.zhao@linaro.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-16057 obdclass: set OBD_MD_FLGROUP for ladvise RPC 80/48080/2
Li Dongyang [Fri, 29 Jul 2022 06:35:41 +0000 (16:35 +1000)]
LU-16057 obdclass: set OBD_MD_FLGROUP for ladvise RPC

The ladvise RPC doesn't have OBD_MD_FLGROUP set. When the RPC
reaches the server, tgt_validate_obdo() will corrupt the FID
if its seq is in the FID_SEQ_NORMAL range.

Do not mess with seq in obdo_to_ioobj() and tgt_validate_obdo(),
since 2.0 all RPCs should have OBD_MD_FLGROUP set.

Add OBD_MD_FLGROUP for ladvise RPC to fix new client talking
to old servers.

Change-Id: I373b7f32458b18e29d9bb716a912fe4a54eccac5
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/48080
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months agoLU-15986 ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs() 39/47839/14
Lei Feng [Thu, 30 Jun 2022 02:46:31 +0000 (10:46 +0800)]
LU-15986 ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs()

There is a race condition on the server side: one thread has sent the
reply message and is deleting it, while another is
searching for an existing request and prints some debug information
in _debug_req() if there is a duplicate request. They both operate on
req->rq_repmsg, but it is not protected in ptlrpc_req_drop_rs().
So protect it with req->rq_early_free_lock.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: Ied55427ee15c3ef84bdd2d579844eba398dbf010
Reviewed-on: https://review.whamcloud.com/47839
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoNew tag 2.15.52 2.15.52 v2_15_52
Oleg Drokin [Sat, 17 Sep 2022 06:27:08 +0000 (02:27 -0400)]
New tag 2.15.52

Change-Id: I7425fd5ea8f382a10ea2574933257fcd41407fa2
Signed-off-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16145 lnet: Honor peer timeout of zero 89/48489/4
Chris Horn [Fri, 2 Sep 2022 16:47:02 +0000 (11:47 -0500)]
LU-16145 lnet: Honor peer timeout of zero

Zero is a valid value for the peer_timeout parameter (it is supposed
to disable the LNet Peer Health feature used on routers), but DLC
treats zero as uninitialized and assigns the default peer timeout
instead.
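
The underlying pitfall, using zero both as a valid setting and as the "not configured" marker, can be shown with a short Python sketch. The default value and both function names are invented for illustration; the actual fix may use a different sentinel.

```python
DEFAULT_PEER_TIMEOUT = 180  # assumed default, for illustration only
UNSET = None                # explicit "not configured" marker

def effective_peer_timeout_buggy(value):
    # Treats 0 like "not configured", so 0 can never be selected.
    return value if value else DEFAULT_PEER_TIMEOUT

def effective_peer_timeout_fixed(value):
    # Only a truly unset value falls back to the default.
    return DEFAULT_PEER_TIMEOUT if value is UNSET else value

print(effective_peer_timeout_buggy(0), effective_peer_timeout_fixed(0))
```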

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11233
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I66f45ddf282757f46c0169ae0e725e56234d3d89
Reviewed-on: https://review.whamcloud.com/48489
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16131 build: Do not depend on libmount during --enable-dist 07/48407/2
Shaun Tancheff [Thu, 1 Sep 2022 14:46:16 +0000 (21:46 +0700)]
LU-16131 build: Do not depend on libmount during --enable-dist

Defer the libmount requirement when using --enable-dist to
generate the lustre-src.rpm.

This allows mock and/or yum build-deps to resolve
dependencies and pick up the libmount requirement without changing
the existing minimal build.

Test-Parameters: trivial
HPE-bug-id: LUS-11091
Fixes: f21b944127 ("LU-15940 build: add a required dependency for libmount")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I20a7a097f9b651b6ea5519f79efda6c96b6f2199
Reviewed-on: https://review.whamcloud.com/48407
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16089 kernel: kernel update RHEL 7.9 [3.10.0-1160.76.1.el7] 02/48202/3
Jian Yu [Fri, 12 Aug 2022 01:29:05 +0000 (18:29 -0700)]
LU-16089 kernel: kernel update RHEL 7.9 [3.10.0-1160.76.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.76.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I97d087a5d5bb27996a5c0caf382c011928c651b4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48202
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16029 utils: add options to lr_reader to parse raw files 88/47988/8
Etienne AUJAMES [Tue, 19 Jul 2022 20:21:52 +0000 (22:21 +0200)]
LU-16029 utils: add options to lr_reader to parse raw files

Add the following usages to lr_reader for post-mortem debugging:

debugfs -c -R "dump reply_data /tmp/reply_data" /dev/mapper/mds1
debugfs -c -R "dump last_rcvd /tmp/last_rcvd" /dev/mapper/mds1

lr_reader -cr -C /tmp/last_rcvd -R /tmp/reply_data
....

This patch also attempts to refactor the lr_reader code.

It enables the use of longer device names (by removing the limitation
of the 128-byte buffer for the debugfs command).

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I6a5f945134d4235ac467ba2274eb05f71b468cd8
Reviewed-on: https://review.whamcloud.com/47988
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: DELBARY Gael <gael.delbary@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16106 lnet: allow direct messages regardless of peer NI status 55/48355/5
Serguei Smirnov [Sun, 28 Aug 2022 01:50:16 +0000 (18:50 -0700)]
LU-16106 lnet: allow direct messages regardless of peer NI status

If check_routers_before_use is enabled, the router needs to
be pinged before it is used, which is not possible because
its NIs are assumed to be down at start-up. Don't prevent
discovery of the router in this case.

This change allows non-routed traffic to peer NIs with "down"
status.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I36fa60e37ef4f47c82c69855c9b0b80bad8a36f4
Reviewed-on: https://review.whamcloud.com/48355
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16050 build: replace ofed_info with dpkg/rpm 47/48047/4
Jian Yu [Wed, 27 Jul 2022 22:45:49 +0000 (15:45 -0700)]
LU-16050 build: replace ofed_info with dpkg/rpm

After installing MLNX_OFED by running the mlnxofedinstall command,
the mlnx-ofed-kernel-modules package is not listed by ofed_info,
which causes Lustre configure to fail as follows:

checking whether to use Compat RDMA... /usr/bin/ofed_info
dpkg-query: error: --listfiles needs at least one package name argument

This patch fixes the above issue by replacing ofed_info with
"dpkg -l" and "rpm -qa" commands to find OFED package.

Test-Parameters: trivial
Fixes: ec03c9628cae ("LU-15417 build: find the new path for MOFED 5.5")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Ia3c2d6bf10e147ca2761221741eff6f93008556c
Reviewed-on: https://review.whamcloud.com/48047
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16002 ptlrpc: adds configurable ping interval 82/47982/3
Alexander Boyko [Sun, 10 Jul 2022 14:25:21 +0000 (10:25 -0400)]
LU-16002 ptlrpc: adds configurable ping interval

The patch adds the ability to change the ping interval and eviction
multiplier. The default values stay as before.
Example:
lctl set_param ping_interval=10
lctl set_param evict_multiplier=5
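
How the two tunables might combine can be sketched in Python. This assumes the eviction timeout is simply the product of the ping interval and the multiplier, which the commit message does not state; the function name is hypothetical.

```python
def eviction_timeout(ping_interval, evict_multiplier):
    """Seconds of silence tolerated before eviction, assuming it is
    the product of the two tunables (an assumption of this sketch)."""
    return ping_interval * evict_multiplier

# Values taken from the example settings above.
print(eviction_timeout(10, 5))
```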

HPE-bug-id: LUS-11054
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I012dc7ba28ce9ff3edf0f145a403679bfaebbf55
Reviewed-on: https://review.whamcloud.com/47982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14719 osp: add inode watermark 28/47128/15
Lai Siyao [Fri, 1 Apr 2022 19:58:08 +0000 (15:58 -0400)]
LU-14719 osp: add inode watermark

* move block watermark from debugfs to sysfs.
* add inode watermark for OSP.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7c768fa2ebfb4b8c2f75255f9e9c061d4c15cf66
Reviewed-on: https://review.whamcloud.com/47128
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16082 ldiskfs: old-style EA inode fix for el8.5/el8.6 96/48496/4
Andreas Dilger [Fri, 9 Sep 2022 08:17:09 +0000 (08:17 +0000)]
LU-16082 ldiskfs: old-style EA inode fix for el8.5/el8.6

Add the rhel8/ext4-old_ea_inodes_handling_fix.patch to the ldiskfs
series for el8.5 and el8.6 kernels.

Test-Parameters: trivial testlist=sanity serverdistro=el8.6
Fixes: 76c3fa96dc30 ("LU-16082 ldiskfs: old-style EA inode handling fix")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifb66a0b7d78e5153d7897bee45fbf1d0e58fbc5c
Reviewed-on: https://review.whamcloud.com/48496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14642 flr: allow layout version update from client/MDS 43/45443/21
Bobi Jam [Mon, 25 Oct 2021 08:45:29 +0000 (16:45 +0800)]
LU-14642 flr: allow layout version update from client/MDS

A client write request always carries its layout version so
that OFD can reject the request if the carried layout version
is stale.

This patch makes OFD allow layout version change requests from
the client as well as the MDS. During resync write, all OST objects
will get their layout version updated.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I655044f69a4509a2b0cfe99f86de2ce4ee846979
Reviewed-on: https://review.whamcloud.com/45443
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16140 lnet: revert "LU-16011 lnet: use preallocate bulk for server" 57/48457/3
Andreas Dilger [Wed, 7 Sep 2022 19:13:11 +0000 (19:13 +0000)]
LU-16140 lnet: revert "LU-16011 lnet: use preallocate bulk for server"

This reverts commit 2447564e120cf622627a5ab81051657f6ce5ece2 due to OOM
on aarch64 clients.

Change-Id: Icfa7d520c36d497566f3e2d154a2065a9aab8da2
Test-Parameters: trivial testlist=lnet-selftest clientarch=aarch64
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-16073 utils: double snapshot_mount fix 25/48225/5
Akash B [Thu, 11 Aug 2022 07:51:57 +0000 (03:51 -0400)]
LU-16073 utils: double snapshot_mount fix

Running lsnapshot_mount on an already mounted snapshot fs
results in umount of the snapshot fs, and the following
error is seen:

-> lsnapshot_mount -F testfs -n snap_test_fo
Can't mount the snapshot snap_test_fo: No such process

Add an additional test (test_1b) to the existing
sanity-lsnapshot.sh to reproduce the above issue.

This is handled by returning -EALREADY if the
snapshot fs is already mounted.
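
The intended behaviour on a repeated mount can be sketched in Python. The `snapshot_mount` helper and the `mounted` set are hypothetical and only model the return codes, not the actual utility.

```python
import errno

def snapshot_mount(name, mounted):
    """Mount `name` unless it is already in the `mounted` set."""
    if name in mounted:
        # Leave the existing mount untouched and report it.
        return -errno.EALREADY
    mounted.add(name)
    return 0

mounted = set()
print(snapshot_mount("snap_test_fo", mounted))
print(snapshot_mount("snap_test_fo", mounted) == -errno.EALREADY)
```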

HPE-bug-id: LUS-10650
Test-Parameters: fstype=zfs testlist=sanity-lsnapshot
Signed-off-by: Akash B <akash-b@hpe.com>
Change-Id: Ia13c3e1cf929ec7c53463a2ea74eb98fb46f8358
Reviewed-on: https://es-gerrit.dev.cray.com/160589
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/48225
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16062 ldlm: improve bl_timeout for prolong 94/48094/2
Vitaly Fertman [Fri, 28 Aug 2020 19:17:58 +0000 (22:17 +0300)]
LU-16062 ldlm: improve bl_timeout for prolong

If there is a client RPC in hand, we can do a better job of
calculating the lock callback timeout, as the RPC carries the
client's view of its own timeout. Let's use it.

HPE-bug-id: LUS-8866, LUS-11074
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ibd67d37c1073d0d3cb2e08b532c801af0de116fe
Reviewed-on: https://es-gerrit.dev.cray.com/157782
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-on: https://review.whamcloud.com/48094
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15791 tests: Get health before removing drop rules 98/47998/3
Chris Horn [Wed, 20 Jul 2022 15:44:39 +0000 (09:44 -0600)]
LU-15791 tests: Get health before removing drop rules

lnet_health_post() can race with recovery pings, so we should
wait to delete the drop rules until after we've gathered the
health and resend values.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 79ab053562 ("LU-13569 lnet: Deprecate lnet_recovery_interval")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ia7595e015809f796cafcc40382d98ab66a708a49
Reviewed-on: https://review.whamcloud.com/47998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15833 llapi: don't use realpath in llapi_search_fsname() 58/47258/10
Etienne AUJAMES [Mon, 9 May 2022 13:44:29 +0000 (15:44 +0200)]
LU-15833 llapi: don't use realpath in llapi_search_fsname()

This patch uses the st_dev value to determine the fsname in
llapi_search_fsname().
The main purpose of this is to limit the number of lstat()
calls (made by realpath()) in this function.

get_root_path() is modified to search for a mountpoint by dev,
and the last result of get_root_path() is cached to avoid reading
/proc/mounts on each call.

A new API function llapi_search_rootpath_by_dev() is added to get
the path of the Lustre mountpoint using the specified device value.
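
The lookup-by-device idea with a one-entry cache can be sketched roughly as below. This is not the liblustreapi implementation: `struct root_cache` and `rootpath_by_dev()` are hypothetical, and the candidate mountpoints are passed in by the caller rather than read from the mount table, to keep the example self-contained.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define ROOT_PATH_MAX 4096	/* illustrative buffer size */

/* Hypothetical one-entry cache: last device resolved and its mountpoint. */
struct root_cache {
	int		valid;
	dev_t		dev;
	char		path[ROOT_PATH_MAX];
	unsigned int	scans;	/* how often the candidate list was walked */
};

/*
 * Find the mountpoint whose st_dev equals 'dev'. A cache hit returns
 * without any stat() call; a miss stats each candidate once.
 * Returns 0 and fills 'out', or -ENOENT / -ENAMETOOLONG.
 */
static int rootpath_by_dev(struct root_cache *cache,
			   const char *const *mounts, size_t nr,
			   dev_t dev, char *out, size_t outlen)
{
	struct stat st;

	if (cache->valid && cache->dev == dev) {
		if (strlen(cache->path) >= outlen)
			return -ENAMETOOLONG;
		strcpy(out, cache->path);
		return 0;
	}

	cache->scans++;
	for (size_t i = 0; i < nr; i++) {
		size_t len = strlen(mounts[i]);

		if (stat(mounts[i], &st) != 0 || st.st_dev != dev)
			continue;
		if (len >= outlen || len >= sizeof(cache->path))
			return -ENAMETOOLONG;

		/* Remember the answer so the next call can skip the scan. */
		strcpy(cache->path, mounts[i]);
		cache->dev = dev;
		cache->valid = 1;
		strcpy(out, mounts[i]);
		return 0;
	}
	return -ENOENT;
}
```

Repeated lookups for files on the same device then cost one comparison instead of a walk over the mount list, which is the effect the syscall counts below measure.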

**Testing:**

*Environment:*
VMs: 1 client, 1 MDS (2MDT), 1 OSS (2 OST)
Lustre tree: test{001..100}/test{001..100}/test{01..10}/file{01..05}
(500000 files + 110100 folders)
OS: Centos 7 (no statx)
Lustre: 2.15.50_15_g1116739

*Tests*
cd <rootfs>
strace lfs getstripe -r .
echo 3 > /proc/sys/vm/drop_caches
/usr/bin/time lfs getstripe -r . (2 iterations)

*Results*
times (s):

                 ______________________________
                | user | system | real | real% |
 _______________|______|________|______|_______|
|without patch: | 6.18 | 57.3   | 427  | 0%    |
|_______________|______|________|______|_______|
|with patch:    | 2.88 | 47.3   | 404  |-5.45% |
|_______________|______|________|______|_______|

strace (only significant changes are displayed):
(*stat = lstat + stat + fstat)
                 _____________________________________________
                | *stat  | mmap   | open   | read   | all     |
 _______________|________|________|________|________|_________|
|without patch: | 760545 | 110142 | 330379 | 330325 | 4742658 |
|_______________|________|________|________|________|_________|
|with patch:    | 440484 | 0      | 220277 | 19     | 3541739 |
|_______________|________|________|________|________|_________|

-25.32% syscalls after patching.

Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I3812d922d5b1d194d52132cba95d11820424c5d7
Reviewed-on: https://review.whamcloud.com/47258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15911 enc: null encrypted names is embedded llcrypt only 20/47520/6
Sebastien Buisson [Fri, 3 Jun 2022 09:16:35 +0000 (11:16 +0200)]
LU-15911 enc: null encrypted names is embedded llcrypt only

enable_filename_encryption tunable only makes sense when Lustre client
is built against embedded llcrypt. When built against in-kernel
fscrypt, this tunable is silently ignored, as fscrypt always carries
out file name encryption.

So have the enable_filename_encryption tunable only when Lustre client
is built against embedded llcrypt. Also fix sanity-sec test_54 so that
it works for in-kernel fscrypt.

Fixes: e68d496ada ("LU-15858 sec: reinstate null encryption for file names")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibe52feb670a00c9f421907ecd438bcccc62856f0
Reviewed-on: https://review.whamcloud.com/47520
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15732 test: don't set RSYNC_SSH=rsh 32/47032/2
John L. Hammond [Mon, 11 Apr 2022 15:11:55 +0000 (10:11 -0500)]
LU-15732 test: don't set RSYNC_SSH=rsh

Let rsync use ssh (the default) in test-framework.sh.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I31fafc72b476070f0a16c1578bc014cc68e21424
Reviewed-on: https://review.whamcloud.com/47032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Casper <jcasper@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15619 osc: Remove submit time 12/46712/5
Patrick Farrell [Wed, 2 Mar 2022 00:08:02 +0000 (19:08 -0500)]
LU-15619 osc: Remove submit time

The osc page submit time is an unused bit of debugging
information, but it's allocated for every page.  Let's
just remove it to save memory.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I160d38039332cb17e07735b60ce7979626ed43dc
Reviewed-on: https://review.whamcloud.com/46712
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10994 osc: remove oap_cli 03/47403/5
John L. Hammond [Thu, 19 May 2022 18:46:26 +0000 (13:46 -0500)]
LU-10994 osc: remove oap_cli

Remove the redundant oap_cli member from struct osc_async_page.

...:(cl_page.c:216:__cl_page_alloc()) slab-alloced 'cl_page': 256 at 000000009ab84b37.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Idd088f0906a10773568495933592ac5e755dc047
Reviewed-on: https://review.whamcloud.com/47403
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10994 clio: remove cpl_obj 02/47402/5
John L. Hammond [Thu, 19 May 2022 18:29:58 +0000 (13:29 -0500)]
LU-10994 clio: remove cpl_obj

Remove cpl_obj from struct cl_page_slice. This member is only used in
the osc layer and struct osc_page already contains a pointer to the
osc_object.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I6451aa50ff0e8db67f1c6f4f7edbde4fa8d36c5b
Reviewed-on: https://review.whamcloud.com/47402
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10994 clio: remove unused convenience functions 01/47401/5
John L. Hammond [Thu, 19 May 2022 18:06:30 +0000 (13:06 -0500)]
LU-10994 clio: remove unused convenience functions

Remove the unused convenience functions cl_page_top(), cl_page_at(),
cl_page_at_trusted(), and cl2vm_page().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I9c994d8f4c81bc93383a9eb46def514685a27690
Reviewed-on: https://review.whamcloud.com/47401
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10994 clio: remove struct vvp_page 00/47400/5
John L. Hammond [Mon, 11 Jul 2022 14:04:12 +0000 (10:04 -0400)]
LU-10994 clio: remove struct vvp_page

Remove struct vvp_page and use struct cl_page_slice in its place. Use
cp_vmpage in place of vpg_page and cl_page_index() in place of
vvp_index().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I2cd408f08e6ff9f7686b591c02ea95e31ad2b2ae
Reviewed-on: https://review.whamcloud.com/47400
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10994 clio: remove cpo_prep and cpo_make_ready 99/47399/6
John L. Hammond [Mon, 22 Aug 2022 15:56:04 +0000 (11:56 -0400)]
LU-10994 clio: remove cpo_prep and cpo_make_ready

Remove the cpo_prep and cpo_make_ready methods from struct
cl_page_operations. These methods were only implemented by the vvp
layer and so they can be easily inlined into cl_page_prep() and
cl_page_make_ready().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I177fd8d3c3832bcc8f06ed98cdf9d30f18d49e88
Reviewed-on: https://review.whamcloud.com/47399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10994 clio: remove vvp_page_print() 98/47398/6
John L. Hammond [Thu, 19 May 2022 16:07:01 +0000 (11:07 -0500)]
LU-10994 clio: remove vvp_page_print()

Remove vvp_page_print() by placing equivalent code in cl_page_print().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I815c4f63dc6fe57eec0987f209a2f34d3ff58146
Reviewed-on: https://review.whamcloud.com/47398
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15595 lnet: Always use ping reply to set route lr_alive 24/46624/11
Chris Horn [Wed, 27 Oct 2021 20:10:17 +0000 (20:10 +0000)]
LU-15595 lnet: Always use ping reply to set route lr_alive

We currently process discovery ping replies in different ways
depending on whether the gateway has discovery enabled or disabled
(or the local peer doing the processing has discovery enabled or
disabled).

When DD is disabled we process the ping reply to set the lr_alive
field of lnet_route because the peer objects for non-MR routers do
not contain all the information needed to calculate the route
aliveness when a message is being sent.

When DD is enabled then we don't do any special processing of the
ping reply. We simply let discovery update the NI status for the
GW's peer NIs and then we calculate the route aliveness on every
send.

We issue discovery pings to routers every alive_router_check_interval
seconds (default 60), but we calculate route aliveness on every send
to a remote network (1000s of times per second). Thus, it is better
to slightly duplicate the effort expended when we receive a discovery
reply so that we can avoid calculating route aliveness on every send.

Since both lr_alive and hop type are being set on each ping reply, for
both DD enabled and disabled cases, we can remove the code for
updating lr_alive and hop type from lnet_router_discovery_complete().

If discovery encounters a fatal error, we still set the status of each
peer NI, as well as all routes, to down in
lnet_router_discovery_complete().
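
The trade-off described above (compute aliveness once per ping reply, read a cached flag on every send) can be sketched as follows. The types and the aliveness rule are simplified assumptions: `struct gw_ni`, `struct route`, and the "alive if any gateway NI is up" rule are illustrative only and do not reproduce the LNet logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for gateway NI and route objects. */
struct gw_ni {
	bool up;
};

struct route {
	bool lr_alive;	/* cached aliveness, read on the send path */
};

/*
 * Slow path, run once per ping reply. Assumption for illustration:
 * the route is alive when at least one gateway NI reports up.
 */
static void route_update_on_ping_reply(struct route *rt,
				       const struct gw_ni *nis, size_t nr)
{
	bool alive = false;

	for (size_t i = 0; i < nr; i++) {
		if (nis[i].up) {
			alive = true;
			break;
		}
	}
	rt->lr_alive = alive;
}

/* Fast path, run on every send: no recalculation, just a flag read. */
static bool route_is_alive(const struct route *rt)
{
	return rt->lr_alive;
}
```

With pings arriving roughly once a minute and sends happening thousands of times per second, moving the loop into the reply handler removes it from the hot path.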

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If4838c269a89885ba3763f62847e294804edf62e
Reviewed-on: https://review.whamcloud.com/46624
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15524 mdd: trigger changelog GC by free space 67/46467/12
Mikhail Pershin [Mon, 7 Feb 2022 10:12:29 +0000 (13:12 +0300)]
LU-15524 mdd: trigger changelog GC by free space

If the amount of space consumed by the changelog becomes comparable
with the system free space, then start an emergency GC for the changelog
by purging the oldest user.

This behavior is enabled by default and can be disabled via the
mdd_changelog_free_space_gc parameter.
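
A minimal sketch of such a trigger is shown below. The function name is hypothetical, and "comparable" is interpreted here as "changelog usage has reached the remaining free space"; the actual threshold used by mdd is not stated in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether an emergency changelog GC should start.
 * Assumption: the GC triggers once the changelog consumes at least
 * as many bytes as are still free. 'gc_enabled' models the tunable.
 */
static bool changelog_space_gc_needed(bool gc_enabled,
				      uint64_t changelog_bytes,
				      uint64_t free_bytes)
{
	if (!gc_enabled)
		return false;

	return changelog_bytes >= free_bytes;
}
```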

Test 160t is added to sanity.sh

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia63cc71e708b0f10cdf54f45f0809c0e86950101
Reviewed-on: https://review.whamcloud.com/46467
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16052 llog: handle -EBADR for catalog processing 70/48070/3
Mikhail Pershin [Fri, 29 Jul 2022 08:24:15 +0000 (11:24 +0300)]
LU-16052 llog: handle -EBADR for catalog processing

Llog catalog processing might retry reading the last llog block
to check for new records, if any. That retry might return -EBADR,
which should be considered valid. Previously -EIO was
returned in all cases.

Run conf-sanity test_106 several times as a specific test.

Test-Parameters: testlist=conf-sanity env=ONLY=106,SLOW=yes,ONLY_REPEAT=10
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I30e04ba2c91c8bdce72c95675a1209639e9f0570
Reviewed-on: https://review.whamcloud.com/48070
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15259 tests: use existing usernames for setfacl 27/45627/34
Andreas Dilger [Wed, 31 Aug 2022 07:51:41 +0000 (00:51 -0700)]
LU-15259 tests: use existing usernames for setfacl

In SLES15.2 and Ubuntu 20 the "bin" and "daemon" users are not
defined in /etc/passwd, causing setfacl to print a cryptic error:

  setfacl -m u:bin:rw f -- failed
  ~     ? setfacl: Option -m: Invalid argument near character 3

Replace "bin" and "daemon" in ACL tests so they are run with user
and group names that exist on all distros currently being tested.
They can also be specified via ACLUSR1/ACLUSR2 in the test config.

The "permission_xattr" test also needs "nobody" user and group.

Also, the "getfacl" command prints users and groups in numerical
order, so the ACL tests will fail if "daemon" < "bin", or if either
group is higher than the "users" group.  Fix them as needed.
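
The setfacl failure above occurs when a name in the ACL spec has no passwd entry. One way to guard against that, sketched here as an assumption rather than what the test scripts do, is to probe candidate names with `getpwnam()` and `getgrnam()` before using them. `pick_acl_name()` is a hypothetical helper.

```c
#include <grp.h>
#include <pwd.h>
#include <stdbool.h>
#include <stddef.h>

/* True if 'name' resolves to a passwd entry on this host. */
static bool user_exists(const char *name)
{
	return getpwnam(name) != NULL;
}

/* True if 'name' resolves to a group entry on this host. */
static bool group_exists(const char *name)
{
	return getgrnam(name) != NULL;
}

/*
 * Return the first candidate that exists both as a user and as a
 * group, or NULL if none does.
 */
static const char *pick_acl_name(const char *const *candidates, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (user_exists(candidates[i]) && group_exists(candidates[i]))
			return candidates[i];
	}
	return NULL;
}
```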

Test-Parameters: trivial testlist=sanity-quota,sanity-sec,pjdfstest
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=el7.9 serverdistro=el7.9
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=el8.6
Test-Parameters: testlist=sanity env=ONLY=103-154,SANITY_EXCEPT=130,HONOR_EXCEPT=y clientdistro=el9.0
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=sles15sp3
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=sles15sp4
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=ubuntu2004

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7003e95577ab3a9314e8d4d29bb6b1784b9f8ae7
Reviewed-on: https://review.whamcloud.com/45627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14441 mdc: check/grab import before access 81/41681/19
Alex Zhuravlev [Mon, 13 Dec 2021 08:27:42 +0000 (11:27 +0300)]
LU-14441 mdc: check/grab import before access

to ensure the import doesn't disappear while being accessed
via procfs.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I005c96b349e55646996fd0d265ab4dd1e2b9a1fa
Reviewed-on: https://review.whamcloud.com/41681
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15829 llite: don't use a kms if it invalid. 95/47395/6
Alexey Lyashkov [Thu, 19 May 2022 17:35:18 +0000 (20:35 +0300)]
LU-15829 llite: don't use a kms if it invalid.

Lockless DIO doesn't update the KMS as other IO types do,
which leads to a situation where the next read doesn't know the real file size
to be read. Let's avoid using an invalid KMS.

Fixes: 6bce5367 (LU-4198 clio: turn on lockless for some kind of IO)
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ie71d3f3cc24fc16c03ed07f9f5a3a17c7fdfa684
Reviewed-on: https://review.whamcloud.com/47395
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: lprocfs_exp_setup() to take struct lnet_nid 42/44642/5
Mr NeilBrown [Thu, 8 Jul 2021 01:32:48 +0000 (11:32 +1000)]
LU-10391 ptlrpc: lprocfs_exp_setup() to take struct lnet_nid

lprocfs_exp_setup() now takes 'struct lnet_nid *' as peer_nid.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If779893d8b1c7b650d39182c121c1f611d058f0d
Reviewed-on: https://review.whamcloud.com/44642
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: pass lnet_nid for self to ptl_send_buf() 41/44641/5
Mr NeilBrown [Thu, 8 Jul 2021 00:53:30 +0000 (10:53 +1000)]
LU-10391 ptlrpc: pass lnet_nid for self to ptl_send_buf()

The 'self' arg to ptl_send_buf() is now a pointer to a
'struct lnet_nid', and can be NULL meaning "ANY NID".

LNetPut() already accepts NULL as the self pointer.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I859dfa10e2f5e50c029c6926fe25ac036fb4f494
Reviewed-on: https://review.whamcloud.com/44641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: change bd_sender in ptlrpc_bulk_frag_ops 40/44640/5
Mr NeilBrown [Tue, 18 Jan 2022 18:12:50 +0000 (13:12 -0500)]
LU-10391 ptlrpc: change bd_sender in ptlrpc_bulk_frag_ops

bd_sender in struct ptlrpc_bulk_frag_ops is now 'struct lnet_nid'.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I43a6600dcc814a6a46b3a793641545123efaa6ab
Reviewed-on: https://review.whamcloud.com/44640
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: change rq_source to struct lnet_nid 39/44639/5
Mr NeilBrown [Sat, 20 Aug 2022 17:30:25 +0000 (13:30 -0400)]
LU-10391 ptlrpc: change rq_source to struct lnet_nid

rq_source in struct ptlrpc_request can now store large NIDs.
ptl_send_buf() now takes a struct lnet_processid for the peer.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2fe7da2332955c69f6252d44fb3ae28d2ef4e517
Reviewed-on: https://review.whamcloud.com/44639
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: change rq_peer to struct lnet_nid 38/44638/4
Mr NeilBrown [Thu, 4 Aug 2022 01:43:26 +0000 (21:43 -0400)]
LU-10391 ptlrpc: change rq_peer to struct lnet_nid

rq_peer in struct ptlrpc_request can now store large NIDs.
ptlrpc_connection_get() and others now take a
struct lnet_processid.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I3bb419720434714301946d278413ce6090aa2cdd
Reviewed-on: https://review.whamcloud.com/44638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: pass net num to ptlrpc_uuid_to_connection 37/44637/4
Mr NeilBrown [Thu, 8 Jul 2021 00:34:36 +0000 (10:34 +1000)]
LU-10391 ptlrpc: pass net num to ptlrpc_uuid_to_connection

Rather than passing a nid to indicate which net to choose,
pass just the net number.  This will make it easier to convert to
'struct lnet_nid'.

Also change ptlrpc_uuid_to_peer() to take the refnet as an explicit
argument, rather than embedding it in the peer pid.

This makes the refnet test more obvious, and removes the (strange)
need to test the address part against zero.
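
To illustrate why an explicit net number is simpler to test, here is a small sketch based on the 64-bit NID layout (network number in the upper 32 bits, address in the lower 32 bits). The helper names are hypothetical and are not the LNet macros or the ptlrpc functions changed by this patch.

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit NID: the network number. */
static uint32_t nid_net(uint64_t nid)
{
	return (uint32_t)(nid >> 32);
}

/* Lower 32 bits of a 64-bit NID: the address within that network. */
static uint32_t nid_addr(uint64_t nid)
{
	return (uint32_t)(nid & 0xffffffffu);
}

/*
 * With the reference net passed as a plain number, the check is a
 * single comparison and never has to look at the address part.
 */
static int nid_on_net(uint64_t nid, uint32_t refnet)
{
	return nid_net(nid) == refnet;
}
```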

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I0650760a59342f5ac245cc14011452e436ef8e4c
Reviewed-on: https://review.whamcloud.com/44637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-10391 ptlrpc: change rq_self to struct lnet_nid 36/44636/4
Mr NeilBrown [Wed, 7 Jul 2021 05:55:06 +0000 (15:55 +1000)]
LU-10391 ptlrpc: change rq_self to struct lnet_nid

rq_self in struct ptlrpc_request can now store large NIDs.
ptlrpc_connection_get() is also changed to receive a
'struct lnet_nid'.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If2ea7770e967e2f044f2b2300950b612463e130c
Reviewed-on: https://review.whamcloud.com/44636
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-8367 tests: cleanup_orphans hang reproducer 39/46939/7
Alexander Boyko [Thu, 12 May 2022 13:49:34 +0000 (09:49 -0400)]
LU-8367 tests: cleanup_orphans hang reproducer

The patch adds recovery-small test 144 to reproduce a hang at
osp_precreate_cleanup_orphans().

PID: 49938  TASK: ffff98c63a248000  CPU: 30  COMMAND: "osp-pre-3-1"
__schedule at ffffffff8e54e1d4
schedule at ffffffff8e54e648
osp_precreate_cleanup_orphans at ffffffffc17d00e9 [osp]
osp_precreate_thread at ffffffffc17d18da [osp]

Test-Parameters: trivial testlist=recovery-small env=ONLY=144b
HPE-bug-id: LUS-10793
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I463c75e63043c71ed0de0c6d08294098099c67e5
Reviewed-on: https://review.whamcloud.com/46939
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16075 kernel: kernel update RHEL8.6 [4.18.0-372.19.1.el8_6] 16/48116/5
Jian Yu [Tue, 23 Aug 2022 01:37:06 +0000 (18:37 -0700)]
LU-16075 kernel: kernel update RHEL8.6 [4.18.0-372.19.1.el8_6]

Update RHEL8.6 kernel to 4.18.0-372.19.1.el8_6.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Change-Id: I8e0fbdab54d36512c4c4cbdbc97c580994ebcbd3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48116
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16090 build: Module.symvers lookup by flavor on SUSE 95/48195/2
Shaun Tancheff [Thu, 11 Aug 2022 11:48:40 +0000 (18:48 +0700)]
LU-16090 build: Module.symvers lookup by flavor on SUSE

When multiple kernel flavors are found we need to select only
the Module.symvers for the flavor that is being built.

HPE-bug-id: LUS-11149
Test-Parameters: trivial
Fixes: 1f4aaefe1aae ("LU-15962 build: add in-kernel Module.symvers to symbol path")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I1c9af91108534d3a67f816077756fded4cd0b653
Reviewed-on: https://review.whamcloud.com/48195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16085 tests: fix sanityn test_106c 35/48435/2
Sebastien Buisson [Tue, 6 Sep 2022 06:57:04 +0000 (08:57 +0200)]
LU-16085 tests: fix sanityn test_106c

Fix sanityn test_106c after modification introduced when fixing
stat attributes_mask.

Test-Parameters: trivial testlist=sanityn env=ONLY=106c
Fixes: 0e48653c27 ("LU-16085 llite: fix stat attributes_mask")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I370813b9b1c22450577c390964a0cc410735b989
Reviewed-on: https://review.whamcloud.com/48435
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16100 tests: fix sanity/51d divide-by-zero 93/48393/2
Andreas Dilger [Tue, 30 Aug 2022 19:14:16 +0000 (13:14 -0600)]
LU-16100 tests: fix sanity/51d divide-by-zero

Fix dirstripe count when testing on non-DNE configs.

Test-Parameters: trivial
Fixes: cf35c54224b3 ("LU-14745 tests: ensure sanity/51d has enough objects")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I787df4cfda9e62673e5f89d2b899154f636777fe
Reviewed-on: https://review.whamcloud.com/48393
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-9859 libcfs: remove Lustre specific bitmap handling 22/48222/3
James Simmons [Tue, 16 Aug 2022 13:46:36 +0000 (09:46 -0400)]
LU-9859 libcfs: remove Lustre specific bitmap handling

Only the NRS TBF handling uses the Lustre specific bitmap
handling. Convert to the Linux bitmap API and remove the
Lustre specific bitmap handling.

Test-Parameters: trivial testlist=sanityn env=ONLY=77
Change-Id: I58dcf869778d6cf6349c16e73d75e53735ffb97d
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/48222
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16085 llite: fix stat attributes_mask 08/48208/3
Sebastien Buisson [Fri, 12 Aug 2022 07:59:02 +0000 (09:59 +0200)]
LU-16085 llite: fix stat attributes_mask

Fix stat attributes_mask to return STATX_ATTR_ENCRYPTED whenever it is
possible. Also fix sanityn test_106c to expect at least the 0x30 flag
for attributes_mask.
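
The "at least 0x30" expectation is a subset check on the mask, not an equality check. The sketch below uses locally defined constants whose values mirror the STATX_ATTR_* flags in <linux/stat.h> (0x10 immutable, 0x20 append-only, 0x800 encrypted); the `EX_` names and `mask_has_at_least()` are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the flag values, prefixed to avoid header clashes. */
#define EX_STATX_ATTR_IMMUTABLE	0x00000010u
#define EX_STATX_ATTR_APPEND	0x00000020u
#define EX_STATX_ATTR_ENCRYPTED	0x00000800u

/*
 * True if every bit in 'required' is set in 'attributes_mask'.
 * Extra bits, such as the encrypted flag, do not cause a failure.
 */
static bool mask_has_at_least(uint64_t attributes_mask, uint64_t required)
{
	return (attributes_mask & required) == required;
}
```

A mask of 0x830 (encryption supported) and a mask of 0x30 both pass the check, while 0x800 alone does not.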

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Icd16beff058c42d77e9b04ad1a287ec2ac04dfed
Reviewed-on: https://review.whamcloud.com/48208
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16093 kernel: kernel update SLES12 SP5 [4.12.14-122.130.1] 04/48204/2
Jian Yu [Fri, 12 Aug 2022 01:41:35 +0000 (18:41 -0700)]
LU-16093 kernel: kernel update SLES12 SP5 [4.12.14-122.130.1]

Update SLES12 SP5 kernel to 4.12.14-122.130.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ib2180a056889d481a7b55c41cbcd98c8e0e272d8
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48204
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>