git://git.whamcloud.com - fs/lustre-release.git/log

EX-5975 build: check OS type before using dpkg

Bright Cluster Manager installs dpkg by default on its
CentOS/RHEL installations, presumably to allow provisioning
Debian nodes in the cluster, so dpkg is in the PATH and
can't be removed.

This patch fixes LB_USES_DPKG to check OS type
before checking if dpkg is installed.
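
The ordering of the two checks can be sketched as below. This is a
Python model, not the LB_USES_DPKG macro itself: the ID and ID_LIKE
keys come from the os-release format, while uses_dpkg(),
parse_os_release() and the DEBIAN_FAMILY set are assumptions made
for illustration.

```python
import shutil

# Assumption: distribution IDs treated as dpkg-based in this sketch.
DEBIAN_FAMILY = {"debian", "ubuntu"}


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def uses_dpkg(os_release_text, dpkg_in_path=None):
    """Return True only if the OS is Debian-like AND dpkg is installed."""
    fields = parse_os_release(os_release_text)
    ids = {fields.get("ID", "")} | set(fields.get("ID_LIKE", "").split())
    # Check the OS type first: dpkg being present on an RPM-based
    # host must not switch the build to Debian packaging.
    if not ids & DEBIAN_FAMILY:
        return False
    if dpkg_in_path is None:
        dpkg_in_path = shutil.which("dpkg") is not None
    return dpkg_in_path
```

With this ordering, a CentOS host that happens to have dpkg in the
PATH is still treated as an RPM build host.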

Test-Parameters: trivial clientdistro=el8.6
Test-Parameters: trivial clientdistro=ubuntu2204 env=SANITY_EXCEPT="130 244a"

Change-Id: Idc9f6edc91f9c89b40f259421b088287e08bfe9c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16090 build: Module.symvers lookup by flavor on SUSE

When multiple kernel flavors are found we need to select only
the Module.symvers for the flavor that is being built.

Lustre-change: https://review.whamcloud.com/48195
Lustre-commit: f3a9921ae4f9c3e48328f2c682e0c7e61221e0d3

HPE-bug-id: LUS-11149
Test-Parameters: trivial
Fixes: 1f4aaefe1aae ("LU-15962 build: add in-kernel Module.symvers to symbol path")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I1c9af91108534d3a67f816077756fded4cd0b653
Reviewed-on: https://review.whamcloud.com/48329
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16059 build: Installation of dkms server builds

The linux-zfs-dkms package is passing the wrong paths
for zfs [and spl] causing the dkms build to fail.

ZFS_VERSION is not parsed correctly from 'dkms status'.

The splver and zfsver check can match against the wrong
package(s).

lustre-zfs-dkms provides: kmod-lustre-osd-zfs, and
lustre-osd-zfs-mount
lustre-ldiskfs-dkms provides: kmod-lustre-osd-ldiskfs and
lustre-osd-ldiskfs-mount

In the case of multiple zfs versions installed, build lustre
osd against the highest version number.
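
Selecting the highest version needs a numeric comparison rather
than a string comparison. A minimal sketch, assuming the candidate
versions have already been extracted as dotted strings (the names
version_key() and pick_highest() are hypothetical and not part of
the packaging scripts):

```python
import re


def version_key(version):
    """Extract the numeric fields of a version, e.g. '2.1.10' -> (2, 1, 10)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def pick_highest(versions):
    """Return the version string with the highest numeric value."""
    return max(versions, key=version_key)
```

A plain string comparison would rank "2.1.9" above "2.1.10", which
is why each field is compared as an integer.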

Lustre-change: https://review.whamcloud.com/48083
Lustre-commit: c3dc67b2c5bf1974d792b3701d932bd04c756bd8

HPE-bug-id: LUS-11113
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic154ca045427bf26cb7e6a44b8c467675e987aad
Reviewed-on: https://review.whamcloud.com/48594
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16089 kernel: kernel update RHEL 7.9 [3.10.0-1160.76.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.76.1.el7.

Lustre-change: https://review.whamcloud.com/48202
Lustre-commit: 94955bbc6dc82b43fd77150b82834132bc56f565

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I97d087a5d5bb27996a5c0caf382c011928c651b4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48277
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16000 utils: align updatelog parameters in llog_reader

Parameters in update log records are aligned on 64 bits. llog_reader
does not align these parameters: if a parameter's size is not a multiple
of 8, the next parameter's size will be read incorrectly.
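
The effect of the missing alignment can be reproduced with a small
model. The record layout used here (an 8-byte header holding a
16-bit length, followed by a payload padded to 8 bytes) is a
simplified assumption for illustration and is not taken from the
llog_reader source.

```python
import struct

# Hypothetical header: little-endian u16 length plus 6 padding bytes.
HEADER = struct.Struct("<H6x")


def round_up8(size):
    """Round size up to the next multiple of 8 (64-bit alignment)."""
    return (size + 7) & ~7


def pack_params(payloads):
    """Serialize payloads with each one padded to an 8-byte boundary."""
    out = bytearray()
    for payload in payloads:
        out += HEADER.pack(len(payload))
        out += payload.ljust(round_up8(len(payload)), b"\0")
    return bytes(out)


def read_params(buf, aligned=True):
    """Return the parameter sizes found while walking the buffer."""
    offset, sizes = 0, []
    while offset + HEADER.size <= len(buf):
        (size,) = HEADER.unpack_from(buf, offset)
        sizes.append(size)
        # Skipping only 'size' bytes lands inside the padding, so the
        # next header is read from the wrong offset.
        step = round_up8(size) if aligned else size
        offset += HEADER.size + step
    return sizes
```

Reading with aligned=False goes wrong as soon as a parameter has a
size that is not a multiple of 8.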

Lustre-change: https://review.whamcloud.com/47913
Lustre-commit: 6d74b759634355e7f6647ccaefef519a1ff208e2

Test-Parameters: trivial
Fixes: 9962d6f ("LU-14617 utils: llog_reader updatelog support")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I6871614ab4ea79d59c3c3b4644b377de395bad56
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-15724 tests: MDT failover hang reproducer

The patch adds test recovery-small 144a to reproduce the
MDT failover hang when precreate threads are blocked on objects.

LustreError: 0-0: Forced cleanup waiting for mdt-kjcf05-MDT0001_UUID
namespace with 46 resources in use, (rc=-110)

Lustre-change: https://review.whamcloud.com/47006
Lustre-commit: aa6250b7412e7baf6760fe4010a81f4f22187127

Test-Parameters: trivial testlist=recovery-small env=ONLY=144a
HPE-bug-id: LUS-10750
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I2743a1b5c8911d6982b527f7e7b7bbbaf310cd04
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/48550
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15724 osp: wakeup all precreate threads

A number of threads can sleep in osp_precreate_reserve() while
waiting for objects from the OST. When the MDT stops, Lustre should
wake up all of these threads. When opd_pre_recovering is set, any
wakeup of opd_pre_user_waitq is useless. Failover of the MDT does not
produce a disconnect event, only an inactive event, so
osp_precreate_cleanup_orphans() cannot be awakened.

LustreError: 0-0: Forced cleanup waiting for mdt-kjcf05-MDT0001_UUID
namespace with 46 resources in use, (rc=-110)

schedule_timeout at ffffffff8e551cd3
osp_precreate_reserve at ffffffffc17d2d83 [osp]
osp_declare_create at ffffffffc17c7eb9 [osp]
lod_sub_declare_create at ffffffffc156415b [lod]
lod_qos_declare_object_on at ffffffffc155bf42 [lod]
lod_ost_alloc_rr.constprop.23 at ffffffffc155db2f [lod]
lod_qos_prep_create at ffffffffc15630a6 [lod]
lod_declare_instantiate_components at ffffffffc154b237 [lod]

Lustre-change: https://review.whamcloud.com/47005
Lustre-commit: e55fc043679cdfadfff6874ef78e2e0128ec37ac

HPE-bug-id: LUS-10750
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: If0164cfbecb1e358d9857421cb234559dc8cecbc
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/48546
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15555 ldiskfs: large directory causes htree corruption

When creating a lot of files in a single directory, it can
get corrupted because of a typo in ext4-kill-dx-root.patch.

Lustre-change: https://review.whamcloud.com/46526
Lustre-commit: ea3ee9337f9bcd42360e4523f1e34bcd04d3bf41

Change-Id: Ia36278580741e1eb905e24a3a6231ba7daaa882a
Fixes: 20a6d32 ("LU-12637 kernel: RHEL 8.1 server support")
HPE-bug-id: LUS-10730
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://review.whamcloud.com/48545
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5380 lipe: wait longer before restarting the access log reader

In lamigo_alr_data_collection_thread() if the access log reader exits
with status zero then it means that no OSTs are mounted on the
host. In this case we should wait longer before restarting the access
log reader.

Lustre-change: https://review.whamcloud.com/47627
Lustre-commit: 27c05f8cb39a8bf8d9e9386841fc7ecd700cf0fb

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I282c6b8e251c432664bc3b4eb202351a5bd7fe5b
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>

LU-14305 ldiskfs: add parameters for mb_c123_threshold

Add mount options for /sys/fs/ldiskfs/*/mb_c[123]_threshold values
so that they can be set persistently via mount options.

The /sys/fs/ldiskfs/*/mb_c[123]_threshold values are always shown
rounded down to the next lower percentage value due to integer
division, since internal values are stored as blocks for efficiency.

Round up the values shown to the next percent to match what was
used to originally set these parameters.
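
The rounding issue can be shown with integer arithmetic. This is a
sketch only: the conversion from percent to blocks is assumed to be
a plain integer division, and the function names are hypothetical.

```python
def percent_to_blocks(percent, total_blocks):
    """Store a percentage threshold as a block count (assumed formula)."""
    return percent * total_blocks // 100


def shown_round_down(blocks, total_blocks):
    """Convert back to percent with truncating integer division."""
    return blocks * 100 // total_blocks


def shown_round_up(blocks, total_blocks):
    """Convert back to percent, rounding up to the next whole percent."""
    return (blocks * 100 + total_blocks - 1) // total_blocks
```

For a filesystem of 1234567 blocks, a threshold set to 25 percent
is stored as 308641 blocks. Truncating division reports it as 24,
while rounding up reports the 25 that was originally set.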

Lustre-change: https://review.whamcloud.com/41193
Lustre-commit: c2fd5297b46c4973aeda4d4d02cbc7ca2faa0d50

Fixes: 95f8ae567749 ("LU-12103 ldiskfs: don't search large block range if disk full")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Artem Blagodarenko <ablagodarenko@whamcloud.com>
Change-Id: Ie36a6667f8bca7481aa8179ab5b97c85d449d619
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48499

LU-15003 sec: use enc pool for bounce pages

Take pages from the enc pool so that they can be used for
encryption, instead of letting llcrypt allocate a bounce page
for every call to the encryption primitives.
Pages are taken from the enc pool a whole array at a time.

This requires modifying the llcrypt API, so that new functions
llcrypt_encrypt_page() and llcrypt_decrypt_page() are exported.
These functions take a destination page parameter.
Until this change is pushed in upstream fscrypt, this performance
optimization is not available when Lustre is built and run against
the in-kernel fscrypt lib.

Using enc pool for bounce pages is a worthwhile performance win. Here
are performance penalties incurred by encryption, without this patch,
and with this patch:

                     ||=====================|=====================||
                     || Performance penalty | Performance penalty ||
                     ||    without patch    |     with patch      ||
||==========================================|=====================||
|| Bandwidth – write |        30%-35%       |   5%-10% large IOs  ||
||                   |                      |    15% small IOs    ||
||------------------------------------------|---------------------||
|| Bandwidth – read  |         20%          |    less than 10%    ||
||------------------------------------------|---------------------||
||      Metadata     |         N/A          |         5%          ||
|| creat,stat,remove |                      |                     ||
||==========================================|=====================||

Lustre-change: https://review.whamcloud.com/47149
Lustre-commit: f3fe144b8572e9e75bb55076e29057227476ebf5

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I3078d0a3349b3d24acc5e61ab53ac434b5f9d0e3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47513
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-14719 osp: add inode watermark

* move block watermark from debugfs to sysfs.
* add inode watermark for OSP.

Lustre-change: https://review.whamcloud.com/47128
Lustre-commit: 336eb696299e1c9731bd1443f05e5d814314ed36

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7c768fa2ebfb4b8c2f75255f9e9c061d4c15cf66
Reviewed-on: https://review.whamcloud.com/47866
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16161 kernel: kernel update RHEL8.6 [4.18.0-372.26.1.el8_6]

Update RHEL8.6 kernel to 4.18.0-372.26.1.el8_6.

Lustre-change: https://review.whamcloud.com/48564
Lustre-commit: TBD (from 66b1b4469d6e5e65b450702c6cb68ec14a51e9b0)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Change-Id: I45bf6dbff5061407e1109732b6d466d0f7a8376c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48575
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-4359 build: add bio-integrity patch to rhel8 series

Add bio-integrity-unbound-concurrency patch to the rhel8.5 and
rhel8.6 series to ensure balanced T10-PI core usage.

Test-Parameters: trivial serverdistro=el8.5 clientdistro=el8.5 testlist=sanity,conf-sanity
Test-Parameters: trivial serverdistro=el8.6 clientdistro=el8.6 testlist=sanity,conf-sanity

Fixes: 97fba9aa48ca ("DDN-2042 bio: allow BIO integrity to run on any core")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I31f9ced4eadad105466556183e2b9e9e0419164d
Reviewed-on: https://review.whamcloud.com/47848
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>

LU-15795 lbuild: enable KABI

Enable the KABI build and clean up the kmodtool patch.

Lustre-change: https://review.whamcloud.com/47507
Lustre-commit: TBD (from 03fc87a2ba08e5c4b8b8787f19b4e736d2752fae)

Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.5 serverdistro=el8.5
Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.6 serverdistro=el8.6

Change-Id: I16d54af0004c4ddc1cc5e6acca81e4aa89a1a1c1
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48486
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14642 flr: allow layout version update from client/MDS

Client write/punch request always carries its layout version so
that OFD can reject the request if the carried layout version
is a stale one.

This patch allows the MDS as well as the client to update the layout
version on OST objects. During a resync write, all OST objects will
get their layout version updated.

Lustre-change: https://review.whamcloud.com/45443
Lustre-commit: fa6574150b6f745a668fe69b2d6d970068

Fixes: 7d97777a5d ("LU-14642 flr: abolish MDS transfer layout version to OST")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9f27af354875d48adda3361f6c8ea5a5f6def73b
Reviewed-on: https://review.whamcloud.com/47097
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-9699 osp: don't assert on OSP duplicating

Writeconf on an MDT with index > 0000 will cause
"add mdc" to be added to $FSNAME-client config
and "add osp" to be added to $FSNAME-MDTXXXX configs.

However, the configs may already contain these
directives. Duplicating the OSP device will
cause the assertion failure in osp_obd_connect():
ASSERTION( osp->opd_connects == 1 ) failed

Duplicating the MDC just returns -EEXIST in a similar
situation.

A possible solution is to check configs for duplicates
before writing to them. However, sometimes we
would like to change nids which are part of
"add mdc" and "add osp".

Another solution is to mark previous entries with
SKIP flags. This patch implements this approach.
Since after revoking the config lock, the clients
and the MDTs will receive the updated log and
apply its newer entries, we still have to handle
OSP duplication, but this is only an issue
immediately after writeconf processing.
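
The SKIP approach can be modelled as an append-only log in which
older matching records are flagged rather than removed. The
ConfigRecord type and the matching rule (same directive and same
target) are assumptions for illustration and do not reflect the
real llog record format.

```python
from dataclasses import dataclass


@dataclass
class ConfigRecord:
    directive: str      # e.g. "add osp" or "add mdc"
    target: str         # device the directive applies to
    nids: tuple         # NIDs carried by the directive
    skip: bool = False  # set when a newer record supersedes this one


def append_directive(log, directive, target, nids):
    """Flag earlier matching records as skipped, then append the new one."""
    for record in log:
        if record.directive == directive and record.target == target:
            record.skip = True
    log.append(ConfigRecord(directive, target, tuple(nids)))


def active_records(log):
    """Records a reader replaying the log from scratch should apply."""
    return [record for record in log if not record.skip]
```

Because the newer record is appended and not rejected, changed NIDs
still take effect, which a plain duplicate check would prevent.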

Lustre-change: https://review.whamcloud.com/27753
Lustre-commit: 98f107b53e4daa3bfaf026c379c0a9c41cb5f161

Seagate-bug-id: MRP-2634, MRP-3865
Change-Id: Idd7ad43c78d50e6bbe715850503aa0b01fcbf071
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48515
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-15262 osd: bio_integrity_prep_fn return value processing

osd_bio_integrity_handle() in lustre/osd-ldiskfs/osd_io.c checks the
return code of bio_integrity_prep_fn(). However, the kernel integrity
API changed between mainline Linux 4.12 and 4.13: in 4.13+ (as well
as in any RHEL8 kernel, including the first beta),
bio_integrity_prep() returns boolean true on success.
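
The two return conventions can be folded into one, as in the model
below. It is written in Python for brevity and is not the osd_io.c
code; normalize_prep_result() and the choice of -EIO as the error
for the boolean case are assumptions.

```python
import errno


def normalize_prep_result(ret, returns_bool):
    """Map either convention to 0 on success, negative errno on failure.

    returns_bool=False models the older API (0 means success).
    returns_bool=True models the newer API (true means success).
    """
    if returns_bool:
        return 0 if ret else -errno.EIO
    return ret
```

A caller that keeps testing "ret != 0" against the newer API would
treat every successful call as a failure, since true is non-zero.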

Lustre-change: https://review.whamcloud.com/45646
Lustre-commit: 41c813d14ec9b353f9cf5ac82638996dcb5273d7

HPE-bug-id: LUS-10443
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I973aa8ccae024157ad863d26afc7b1264a5c7149
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-on: https://review.whamcloud.com/48582
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>

RM-620 build: New tag 2.14.0-ddn60

New tag 2.14.0-ddn60

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib500a2a5f4677f496380750ff0ca3eee7eff1b57

LU-15860 socklnd: Duplicate ksock_conn_cb

If two threads enter ksocknal_add_peer(), the first one to acquire
the ksnd_global_lock will create a ksock_peer_ni and associate a
ksock_conn_cb with it.

When the second thread acquires the ksnd_global_lock it will find the
existing ksock_peer_ni, but it does not check for an existing
ksock_conn_cb. As a result, it overwrites the existing ksock_conn_cb
(ksock_peer_ni::ksnp_conn_cb) and the ksock_conn_cb from the first
thread becomes stranded.

Modify ksocknal_add_peer() to check whether the peer_ni has an
existing ksock_conn_cb associated with it.
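
The check-under-lock pattern looks roughly like this. It is a
user-space Python model rather than the socklnd code: PeerTable
and its fields are hypothetical, and dropping the second caller's
object is an assumption about how the duplicate is handled.

```python
import threading


class PeerTable:
    def __init__(self):
        self._lock = threading.Lock()  # stands in for ksnd_global_lock
        self._peers = {}               # peer id -> {"conn_cb": object}

    def add_peer(self, peer_id, make_conn_cb):
        """Attach a conn_cb to the peer unless one is already attached."""
        new_cb = make_conn_cb()        # allocated before taking the lock
        with self._lock:
            peer = self._peers.setdefault(peer_id, {"conn_cb": None})
            if peer["conn_cb"] is None:
                peer["conn_cb"] = new_cb
            # Otherwise an earlier caller won the race: keep its conn_cb
            # and let new_cb be discarded instead of overwriting.
            return peer["conn_cb"]
```

Every caller ends up with the same conn_cb for a given peer, so
none is left stranded.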

Lustre-change: https://review.whamcloud.com/47361
Lustre-commit: 0c91d49a44e1214b5c65d4a557f6969b3d217881

Fixes: 7766f01e89 ("LU-13641 socklnd: replace route construct")
HPE-bug-id: LUS-10956
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6c0190a0c1d3321ddd85c763b86ad1f0d32cf2b9
Reviewed-on: https://review.whamcloud.com/48259
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15234 lnet: Race on discovery queue

If the discovery thread clears the LNET_PEER_DISCOVERING bit then a
race window opens when the discovery thread drops the
lnet_peer.lp_lock spinlock and closes when the discovery thread
acquires the lnet_net_lock. If another thread queues the peer for
discovery during this window then the LNET_PEER_DISCOVERING bit is
added back to the peer state, but since the peer is already on the
lnet.ln_dc_working queue, it does not get added to the
lnet.ln_dc_request queue.

When the discovery thread acquires the lnet_net_lock/EX, it sees that
the LNET_PEER_DISCOVERING bit has not been cleared, so it does not
call lnet_peer_discovery_complete() which is responsible for sending
messages on the peer's discovery pending queue.

At this point, the peer is stuck on the lnet.ln_dc_working queue, and
messages may continue to accumulate on the peer's
lnet_peer.lp_dc_pendq.

Fix the issue by re-working the main discovery thread loop so that we
do not release the lnet_peer.lp_lock until after we've determined
whether we need to call lnet_peer_discovery_complete().
This ensures that the lnet_peer is correctly removed from the
discovery work queue and any messages on the peer's
lnet_peer.lp_dc_pendq are sent or finalized.

It is also possible for the lnet_peer.lp_dc_error to be cleared
during the aforementioned window, as well as during the time when
lnet_peer_discovery_complete() is processing the contents of the
lnet_peer.lp_dc_pendq. This could prevent messages on the
lnet_peer.lp_dc_pendq from being correctly finalized. To fix this
issue, the responsibilities of lnet_peer_discovery_error() were
incorporated into lnet_peer_discovery_complete().

Lustre-change: https://review.whamcloud.com/45670
Lustre-commit: 852a4b264a984979dcef1fbd4685cab1350010ca

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-10615
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I3779a342de7108105c2fd2bc41373560e8e5ef14
Reviewed-on: https://review.whamcloud.com/48313
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14941 lnet: Fix source specified to routed destination

If a source NI is specified for a send then we should not modify the
destination NID that was passed to lnet_send().

Lustre-change: https://review.whamcloud.com/44730
Lustre-commit: 98da4ace43a6c4c59e7981bf0fb649005237d12f

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-10301
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie47558d5bce97a0dca30ff7d072dcd39eb903324
Reviewed-on: https://review.whamcloud.com/48441
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14940 lnet: Fix source specified send to different net

The destination NI is fixed for all source-specified sends. Thus, in
order for a source-specified send to be considered "local", i.e. a
send that does not require a route, the destination NID must be on
the same net as the specified source.
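
The rule reduces to comparing the net part of the two NIDs. A
sketch, assuming NIDs written as "address@net"; the helper names
are hypothetical, and the sketch compares net names literally, so
it does not treat "tcp" and "tcp0" as the same net.

```python
def nid_net(nid):
    """Return the net part of an 'address@net' NID string."""
    addr, sep, net = nid.rpartition("@")
    if not sep or not addr or not net:
        raise ValueError("malformed NID: %r" % (nid,))
    return net


def is_local_send(src_nid, dst_nid):
    """A source-specified send is local only if both NIDs share a net."""
    return nid_net(src_nid) == nid_net(dst_nid)
```

When is_local_send() is false, the send needs a route even if the
destination peer also has an interface on a local net.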

Lustre-change: https://review.whamcloud.com/44728
Lustre-commit: 3e3563f719ce89de28d276f3de1add064932506b

HPE-bug-id: LUS-10303
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I4847db1d393bbc36def65123f260b928ebbf944e
Reviewed-on: https://review.whamcloud.com/48440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14660 lnet: Fix destination NID for discovery PUSH

If we're sending a discovery PUSH after receiving a discovery
REPLY then we want to send via the same NID that the reply was
sent to. This introduces a challenge in selecting an appropriate
destination NID for the PUSH because lnet_select_pathway() will not
run the MR selection algorithm for choosing a peer NI if the source
NI has been specified.

It is reasonable to assume that the NID used by the message
originator in sending the REPLY is a suitable destination for the
discovery PUSH. Thus, we record this NID in the same location we
currently record the lp_disc_src_nid, and use it when sending the
PUSH. With this change, the only other user of lnet_peer_select_nid()
is lnet_peer_send_ping(). In the ping case we do not set a source NID,
so lnet_select_pathway() is free to choose any peer NI. So this change
allows us to get rid of lnet_peer_select_nid() altogether.

Alternatively, we would need to reproduce a lot of the path selection
algorithm inside lnet_peer_select_nid() in order to avoid sending to
unhealthy NIDs. It seems undesirable and unnecessary to duplicate that
logic.

Lustre-change: https://review.whamcloud.com/43507
Lustre-commit: dce2f7d1987711dfdced903b13e67091cffe9628

Test-Parameters: trivial
HPE-bug-id: LUS-9333
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I47ef856075f049d71c395565974204b8f6fa9003
Reviewed-on: https://review.whamcloud.com/48439
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13950 lnet: do not crash if lnet_sock_getaddr returns error

Some network issues can lead to a panic in ksocknal_accept():

rc = lnet_sock_getaddr(sock, true, &peer_ip, &peer_port);
LASSERT(rc == 0); /* we succeeded before */

Pass this error to the caller instead of asserting.

Lustre-change: https://review.whamcloud.com/39834
Lustre-commit: 48a9ea82eb30bbbf66cce527c1205d13fbd4eb58

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I34d43c19b4e75422db50e7abb02cac3510882b0d
HPE-bug-id: LUS-9256
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://review.whamcloud.com/48443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14206 lnet: Router ping timeout with discovery disabled

Discovery pings are used to determine the health of gateways and
associated routes. Ping replies from gateways with dynamic discovery
(DD) disabled (or if DD is disabled locally) are handled in
a special routine, lnet_router_discovery_ping_reply(), but this
function and related code doesn't handle the case where a discovery
ping hits the response tracker timeout and is unlinked by the
monitor thread. In this case, an UNLINK event is generated and we
do not call the lnet_router_discovery_ping_reply(). For gateways
with DD enabled (and DD enabled locally), we handle this case
in lnet_router_discovery_complete(). If discovery failed then
lp_dc_error is set and we mark all routes down for the gateway. We
can simply extend this logic to the case of gateways w/DD disabled
(or DD disabled locally).

Lustre-change: https://review.whamcloud.com/40923
Lustre-commit: 173d86c6e9a704a84de36ae57a337a3fdae7b1ed

Test-Parameters: trivial
Fixes: 9f337d94e7 ("LU-13029 lnet: fix asym routing with multi-hop")
HPE-bug-id: LUS-9612
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I009c69d4f8990b72d83d9426c782c0e55c1023a4
Reviewed-on: https://review.whamcloud.com/48382
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15275 lnet: Skip router discovery on send path

When the router checker is enabled, routes are regularly marked as out
of date w.r.t. discovery. This can cause upper level messages to be
delayed while the router undergoes discovery. We can avoid delaying
messages by relying on the router checker to initiate discovery of
routers. If we happen to send a message to a router before it has
been discovered then the worst case scenario is that the route is
actually down or we end up utilizing a subset of a multi-rail router's
interfaces. Both situations can be remedied by utilizing the
check_routers_before_use parameter.

Change the logic in lnet_handle_find_routed_path() so that we only
initiate discovery if the alive_router_check_interval is <= 0 (i.e.
router checker pings are disabled).
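
The new condition can be written as a single predicate. This is a
sketch of the decision only; should_discover_on_send() and its
router_needs_discovery argument are hypothetical names and are not
taken from lnet_handle_find_routed_path().

```python
def should_discover_on_send(alive_router_check_interval,
                            router_needs_discovery):
    """Decide whether the send path itself should start discovery.

    An interval <= 0 means router checker pings are disabled, so
    nothing else will discover the router on our behalf.
    """
    return router_needs_discovery and alive_router_check_interval <= 0
```

With the router checker enabled, the send path never waits on
discovery, and the checker remains responsible for starting it.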

Lustre-change: https://review.whamcloud.com/45684
Lustre-commit: c8e74c395d5634dbb0d9d8a86605bb36ab2b8233

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If0332c21f6157117598b7b908fe17f2d2690fc1d
Reviewed-on: https://review.whamcloud.com/48383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13781 lnet: Local NI must be on same net as next-hop

When sending to a remote peer we need to restrict our selection of a
local NI to those on the same peer net as the next-hop.

The code currently selects a local NI on the peer net specified by the
lr_lnet field of the lnet_route returned by lnet_find_route_locked().
However, lnet_find_route_locked() may select a next-hop peer NI on any
local peer net - not just lr_lnet.

A redundant assignment to sd->sd_msg->msg_src_nid_param is also
removed. That variable is always set appropriately in
lnet_select_pathway().

Lustre-change: https://review.whamcloud.com/39352
Lustre-commit: 031c087f3847777c0099cbfae13f0b6fee54452b

Test-Parameters: trivial
HPE-bug-id: LUS-9095
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If1bec26d6646b9e66b99656d7db2dc538d631a34
Reviewed-on: https://review.whamcloud.com/48381
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13714 lnet: only update gateway NI status on discovery

Move the NI status from DOWN to UP only when receiving
a discovery PING. The discovery PING should be the only
message which should update the NI status since it's used
as the gateway NI keep alive mechanism.

This is done to avoid the following scenario:

The gateway itself can push its updates to peers which
have removed it from their routing tables. The peers would
respond to the PUSH with an ACK, and the ACK would bring the
gateway's NI status to up. Therefore other peers which have
avoid_asym_router_failure=1 will have their route status
remain up even though the symmetrical route is gone.

Note: there is no way for the gateway to differentiate between
a keep alive discovery and a manually triggered discovery or ping.
However, this is a narrow case which will not be handled.

net_last_alive is converted to use ktime_get_seconds() instead of
ktime_get_real_seconds() since the NTP adjustment is not needed.

Lustre-change: https://review.whamcloud.com/39176
Lustre-commit: 3e3f70eb1ec95f32d9a97795d7fdf02cca82b5a0

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifd5b06d4cf783b68b36413ada63f0a1d0095fb5b
Reviewed-on: https://review.whamcloud.com/48379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15039 lnet: Fix reference leak in lnet_parse

We need to drop the reference taken by lnet_nid2peerni_locked() if we
determine that we need to drop the message because of asymmetric
route.

Lustre-change: https://review.whamcloud.com/45067
Lustre-commit: e69eca08bce47bf85b3c011598e360a2468019b5

Test-Parameters: trivial
HPE-bug-id: LUS-9186
Fixes: 955080c3ae ("LU-13779 lnet: Correct asymmetric route detection")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I799c9522b1ce5f4caffc5848a829995e5b5484e7
Reviewed-on: https://review.whamcloud.com/48378
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14945 lnet: don't use hops to determine the route state

NodeA <-tcp1-> GW1 <-tcp2-> GW2 <-tcp3-> NodeB

Assuming GW1 knows how to reach tcp3 network and GW2 knows
how to reach tcp1 network, it should be possible to add routes
without specifying hop=2 on nodes A and B to reach tcp3 and tcp1
respectively and then be able to lnetctl ping between them.
Changes introduced by LU-13785 interpret default hops to be
equivalent to hop=1 set explicitly for the purpose of determining
route aliveness, which results in the routes created as described
above being considered "down".

Fix it so that the default hop setting doesn't prevent
the multi-hop scenario from working.

Lustre-change: https://review.whamcloud.com/44674
Lustre-commit: 3f2844dc9333c86452c37bd7b4519729b1351371

Test-Parameters: trivial
Fixes: 2e07619477 ("LU-13785 lnet: Use lr_hops for avoid_asym_router_failure")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I341ccdfe156434b0cb306359acc91a9193b44f7b
Reviewed-on: https://review.whamcloud.com/48337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13780 lnet: Leverage peer aliveness more efficiently

When an LNet router is revived after going down, remote peers may
discover it is alive before we do. Thus, remote peers may use it
as a next-hop, and we may start receiving messages from it while we
still consider it to be dead. We should mark router peers as alive
when we receive a message from them.

If an LNet router does not respond to a discovery ping, then we
currently mark all of its NIs as DOWN. This can actually slow down
the process of returning a route to service. If we receive a message
from a router, in the manner described above, then we can safely
return the router to service. We already set the status of the router
NI we received the message from to UP, but the remote NIs will still
be DOWN and thus the route will be considered down until we get a
reply to the next discovery ping.

When selecting a route, we only consider the aliveness of a gateway's
remote NIs if avoid_asym_router_failure is enabled and the route is
single-hop. In this case, as long as the gateway has at least one
alive NI on the remote network then the route is considered UP. In
the situation described above, we know the router has at least one
NI alive because it was used to forward a message from a remote peer.
Thus, when we receive a forwarded message from a router, we can
reasonably set the NI status of all of its NIs that are on the same
peer net as the message originator to UP. This does not impact the
route status of any multi-hop routes because we do not consider the
aliveness of remote NIs for multi-hop routes.

Similarly, we can set the cached lr_alive value to up for any routes
whose lr_net matches the net ID of the message originator NID. This
variable is converted to an atomic_t to get rid of the need for
global locking when updating it.
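
The update rule described above can be sketched as follows. This is a
minimal Python model, not the LNet C code: RouterNI, Route, Router and
on_forwarded_message() are hypothetical names, NIDs are modelled as
"addr@net" strings, and the atomic_t conversion is not modelled.

```python
from dataclasses import dataclass, field


def net_of(nid: str) -> str:
    """Return the net part of a NID, e.g. 'tcp2' for '10.0.2.5@tcp2'."""
    return nid.split("@", 1)[1]


@dataclass
class RouterNI:
    nid: str
    up: bool = False


@dataclass
class Route:
    lr_net: str             # remote net reached through the gateway
    lr_alive: bool = False  # cached aliveness


@dataclass
class Router:
    nis: list = field(default_factory=list)
    routes: list = field(default_factory=list)


def on_forwarded_message(router: Router, from_nid: str,
                         originator_nid: str) -> None:
    """Update router state after it forwarded a message to us."""
    origin_net = net_of(originator_nid)
    for ni in router.nis:
        # The NI that delivered the message is alive, and the router
        # must have a working NI on the originator's net; mark every
        # NI on that net UP.
        if ni.nid == from_nid or net_of(ni.nid) == origin_net:
            ni.up = True
    for route in router.routes:
        if route.lr_net == origin_net:
            route.lr_alive = True
```

NIs and routes on unrelated nets are left untouched, so they still
wait for the next discovery ping reply.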

Lustre-change: https://review.whamcloud.com/39350
Lustre-commit: 886e34ce56c491e8844cf892f32b08807cdf2bff

Test-Parameters: trivial
HPE-bug-id: LUS-9088
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0170762d78d80e4b70724799cd1ee1301118f25c
Reviewed-on: https://review.whamcloud.com/48335
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>

LU-13785 lnet: Use lr_hops for avoid_asym_router_failure

In order for the asymmetric route failure avoidance feature to work
properly it needs to know what the hop count of a route should be.
This information is defined by the lr_hops field of the lnet_route.
The lr_single_hop is what discovery was able to determine the hop
count actually is (single or multi) based on the last ping reply.
If a remote interface on a router goes missing, the route may be
classified as multi-hop by discovery, but it should be considered
single-hop for the purposes of avoiding asymmetric route failure.

Lustre-change: https://review.whamcloud.com/39362
Lustre-commit: 2e07619477684f287a2399ccdbbde0a71289574b

HPE-bug-id: LUS-9099
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I9c255f9a2175d964661850277808dae96ff7735c
Reviewed-on: https://review.whamcloud.com/48336
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13779 lnet: Correct asymmetric route detection

Failure to look up the remote net for LNET_NIDNET(src_nid) indicates an
asymmetric route, but we do not drop the message in this case. Another
problem with this code is that there is no guarantee that we'll have a
route->lr_lnet that matches the net of ni->ni_nid.

We can move the asymmetric route detection to after we have looked up
the lpni of from_nid. Then, we can look at just the routes associated
with the gateway that owns the lpni. If one of those routes has
lr_net == LNET_NIDNET(src_nid), then the route is symmetrical.
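
The symmetry test described above can be sketched as follows. This is
a minimal Python model, not the LNet C code: Route, Gateway and
is_symmetric() are hypothetical names, and nid_net() stands in for
LNET_NIDNET() with NIDs modelled as "addr@net" strings.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    lr_net: str  # remote net reached through this route


@dataclass
class Gateway:
    routes: list = field(default_factory=list)


def nid_net(nid: str) -> str:
    """Return the net part of a NID such as '10.0.0.1@tcp1'."""
    return nid.split("@", 1)[1]


def is_symmetric(gateway: Gateway, src_nid: str) -> bool:
    """True if one of the gateway's routes leads to the source's net."""
    src_net = nid_net(src_nid)
    return any(route.lr_net == src_net for route in gateway.routes)
```

Only the routes that belong to the forwarding gateway are examined,
which is why the lookup of the lpni for from_nid has to happen first.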

Lustre-change: https://review.whamcloud.com/39349
Lustre-commit: 955080c3ae3f33c98e068f52a096761ea28624b7

Fixes: 4932febc12 ("LU-11894 lnet: check for asymmetrical route messages")
Test-Parameters: trivial
HPE-bug-id: LUS-9087
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I8044d3f53e6f000c1e4d7c4e34b3b21afe0f9711
Reviewed-on: https://review.whamcloud.com/48334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13708 lnet: lnet_notify sets route aliveness incorrectly

lnet_notify() modifies route aliveness in two ways:
1. By setting lp_alive field of the lnet_peer struct.
2. By setting lr_alive field of the lnet_route struct (via call to
lnet_set_route_aliveness())

In both cases, the aliveness value assigned is determined by a call
to lnet_is_peer_ni_alive(), but that value only reflects the aliveness
of a particular peer NI. A gateway may have multiple peer NIs, so the
aliveness of a gateway peer (lp_alive) is not necessarily equivalent
to the aliveness of one of its NIs. Furthermore, the lr_alive field
is only used to determine route aliveness for path selection if
discovery is disabled locally or on the gateway (see
lnet_find_route_locked() and lnet_is_route_alive()).

In general, we should not set lp_alive based on an lnet_notify()
call, and we should only set lr_alive if discovery is disabled. For
lr_alive specifically, we should only set it for those routes that
have the peer NI as a next-hop.

An exception to the above exists when the reset argument to
lnet_notify() is set. The gnilnd uses this flag in its calls to
lnet_notify() because gnilnd receives out-of-band notifications of
node up and down events. Thus, when gnilnd calls lnet_notify() we
actually know whether the gateway peer is up or down and we can set
lp_alive appropriately.

net lock/EX is held by other callers of lnet_set_route_aliveness, so
we do the same in lnet_notify().
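
The rules above can be sketched as follows. This is a minimal Python
model under stated assumptions, not the lnet_notify() implementation:
Route, GatewayPeer and notify() are hypothetical names, locking is not
modelled, and the two rules are applied independently because the
text does not say how the reset case interacts with lr_alive.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    next_hop_nid: str      # NID of the peer NI used as next-hop
    lr_alive: bool = True


@dataclass
class GatewayPeer:
    lp_alive: bool = True
    routes: list = field(default_factory=list)


def notify(gw: GatewayPeer, peer_ni_nid: str, alive: bool,
           reset: bool, discovery_disabled: bool) -> None:
    """Apply an aliveness notification for one peer NI of a gateway."""
    if reset:
        # Out-of-band knowledge (the gnilnd case): the state of the
        # whole gateway is known, so lp_alive may be set.
        gw.lp_alive = alive
    if discovery_disabled:
        # lr_alive is only consulted when discovery is off, and only
        # routes that use this peer NI as next-hop are affected.
        for route in gw.routes:
            if route.next_hop_nid == peer_ni_nid:
                route.lr_alive = alive
```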

Lustre-change: https://review.whamcloud.com/39160
Lustre-commit: e24471a722a6f23fb0051b4511f3fee2662d0e4e

Fixes: e35be987da ("LU-12422 lnet: discovery off route state update")
Fixes: ebc9835a97 ("LU-12941 lnet: Add peer level aliveness information")
Test-Parameters: trivial
HPE-bug-id: LUS-9034
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I2927e5f5ef849e45c233c92d2a6deca765e496eb
Reviewed-on: https://review.whamcloud.com/48290
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16012 sec: fix detection of SELinux enforcement

On newer distros (e.g. RHEL 9.0), on which selinux_is_enabled() does
not exist anymore, the only way to find out if SELinux is enforced
when initializing the security context is to fetch the length of the
security attribute name. If it is 0, we conclude SELinux is disabled.

Lustre-change: https://review.whamcloud.com/48049
Lustre-commit: 155cbc22ba4f758cf9eec415f36f940ca2b23de9

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ifcdcb8ffbb7f9ad50d16d7d3317e94d0d212fa42
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48422
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>

EX-5815 lipe: do not print in lpcc signal handler

Do not print in the lpcc signal handler.
Printing from a signal handler is not valid in a Python script.
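
A common way to avoid output in a handler is to record the signal and
report it later from normal code. The sketch below shows that
pattern; it is an assumption about the general technique, not the
actual lpcc change, and StopFlag, install() and report() are
hypothetical names.

```python
import signal


class StopFlag:
    """Records that a signal arrived; the main loop decides what to do."""

    def __init__(self):
        self.signum = None

    def handler(self, signum, frame):
        # Only record the event here: no print() or logging.
        self.signum = signum


def install(flag: StopFlag,
            signals=(signal.SIGINT, signal.SIGTERM)) -> None:
    """Register flag.handler for the given signals (main thread only)."""
    for sig in signals:
        signal.signal(sig, flag.handler)


def report(flag: StopFlag) -> str:
    """Build the message outside the handler, once the flag is seen."""
    if flag.signum is None:
        return ""
    return "stopping on signal %d" % flag.signum
```

The main loop would poll flag.signum between units of work and print
the result of report() itself.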

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=210
Change-Id: I61eb80ff1d59453dc12855fd2f1ac4f1e6e40757
Reviewed-on: https://review.whamcloud.com/48449
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-3442 tests: use wait_file_resync in hot-pools test 15

This patch replaces "$LFS mirror resync" with
"wait_file_resync" in hot-pools test 15 to avoid
racing with lamigo's "$LFS mirror resync".

Test-Parameters: trivial testlist=hot-pools,hot-pools

Change-Id: I48ffb7d6a33b664359f227d1f693369feffa70b6
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47233
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-2010 scsi: requeue aborted commands for el8

If the underlying SCSI command returns an abort, rather than retry
it quickly in a loop, which can finish within a few milliseconds,
requeue it with delay so that the hardware has a chance to recover.

The command requeue will take several seconds each time and allows
more chance for the problem to be resolved at the SCSI layer instead
of returning an error to the filesystem and causing server failover.
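
The difference between a tight retry loop and a delayed requeue can
be sketched as follows. This is a Python illustration only, not the
kernel patch: run_with_requeue(), the status strings, the requeue
limit and the delay value are all assumptions.

```python
import time

ABORTED = "aborted"
OK = "ok"


def run_with_requeue(issue, max_requeues=3, delay_s=2.0,
                     sleep=time.sleep):
    """Issue a command; on abort, wait delay_s before each requeue.

    `issue` is a hypothetical callable returning OK or ABORTED.
    Returns the final status and the number of requeues used.
    """
    status = issue()
    requeues = 0
    while status == ABORTED and requeues < max_requeues:
        sleep(delay_s)  # give the hardware time to recover
        requeues += 1
        status = issue()
    return status, requeues
```

With no delay the same loop would exhaust its retries within a few
milliseconds; with a delay of seconds per requeue, a short-lived
fault has a chance to clear before an error is returned.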

This patch is no longer required with SFAOS 11.8.3 and later, as SFAOS
will change ABORT to busy (SFAP-71972). Patch can be removed once we
are certain of SFA version and/or have removed other kernel patches.

Test-Parameters: trivial clientdistro=el8.6 serverdistro=el8.6
Test-Parameters: trivial clientdistro=el8.5 serverdistro=el8.5

Signed-off-by: Trung Nguyen <trunguyen@ddn.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibdf1b3a52dd0a1b388c7f5f97aa7a516203ebbe5
Reviewed-on: https://review.whamcloud.com/48340
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>

LU-16138 kernel: preserve RHEL8.x server kABI for block integrity

Currently there are two kernel patches supporting SCSI T10-PI feature
left in the RHEL8.x series:

- block-integrity-allow-optional-integrity-functions-rhel8.patch
- block-pass-bio-into-integrity_processing_fn-rhel8.patch

The changes in the patches modified "struct bio_integrity_payload"
and "struct blk_integrity_iter", which caused kABI breakage.

This patch fixes the patches to preserve kABI by using
RH-supplied compatibility macros.

Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.5 serverdistro=el8.5
Test-Parameters: trivial fstype=ldiskfs clientdistro=el8.6 serverdistro=el8.6

Change-Id: If547e1cd4ae4ff1affd315bbfefaeeff4f1dea81
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48445
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16075 kernel: kernel update RHEL8.6 [4.18.0-372.19.1.el8_6]

Update RHEL8.6 kernel to 4.18.0-372.19.1.el8_6.

Lustre-change: https://review.whamcloud.com/48116
Lustre-commit: TBD (077f4b13e7fbe564a79c35487e8208e8381fc833)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Change-Id: I8e0fbdab54d36512c4c4cbdbc97c580994ebcbd3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48319
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16082 ldiskfs: old-style EA inode handling fix

The upstream version of EA inode support that comes
with RHEL8 (Linux kernel 4.18+) has a slightly different
implementation and also has compatibility code to
recognize old-style Lustre-only EAs.
Unfortunately the compatibility code is broken and makes
old xattr data inaccessible due to a wrong hash value check.

Lustre-change: https://review.whamcloud.com/48174
Lustre-commit: 76c3fa96dc30f21e95d80f9119972d7358975258

HPE-bug-id: LUS-11133
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Icd6f93d4ebb33dcd03b58f9eb364905c18ae81dc
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48413
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-14719 utils: dir migration stop on error

Once directory migration fails, it should stop immediately since
current migration won't succeed, and subsequent migration may
fail on the same error.
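
The stop-on-first-error behaviour can be sketched as follows. This is
a Python illustration, not the lfs utility code: migrate_all() and
the migrate_one callable are hypothetical.

```python
def migrate_all(entries, migrate_one):
    """Migrate entries in order and stop at the first failure.

    `migrate_one` is a hypothetical callable returning 0 on success
    or a non-zero error code. Returns (error_code, entries_migrated).
    """
    done = 0
    for entry in entries:
        rc = migrate_one(entry)
        if rc != 0:
            return rc, done  # do not try the remaining entries
        done += 1
    return 0, done
```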

Lustre-change: https://review.whamcloud.com/47040/
Lustre-commit: 9ca348e8769d2c613082eeaeaf2775e22625e970

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I96c1693d1b1da0856c925b9b22c1ab7f3181f0d8
Reviewed-on: https://review.whamcloud.com/47868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15694 quota: keep grace time while setting default limits

The quota grace time should only be changed by "lfs setquota -t",
and it should be kept while setting default quota limits.

This patch also fixes an issue of not saving the grace time while
writing global quota record.

Lustre-change: https://review.whamcloud.com/46935
Lustre-commit: d4978678b49102226a79a6c8e5d10075d416977d

Signed-off-by: Hongchao Zhag <hongchao@whamcloud.com>
Change-Id: I89ca49d09dc41deffe4bc77e53721b5bb4f4be37
Reviewed-on: https://review.whamcloud.com/48416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14472 tests: modify version_code in sanity test

Modify version_code in sanity-quota.sh and sanity-sec.sh according to
the DDN version. There is no need to modify version_code on other
branches.

The version_code in test_59c() and test_49 in sanity-sec.sh was
changed to 2.14.0.50 according to Sébastien's suggestion.

Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: Ie448cbf60f6bdacdbba39ab0a1a86c6953d51ecb
Reviewed-on: https://review.whamcloud.com/48282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5014 pcc: minor fixes for parameter checks

Improve console message when out-of-range pcc_dio_attach_size_mb
values are supplied.

Fix sanity-pcc test_49b to allow future limit changes.

Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2bf7d0bf564c954318980f7a09d8713a70f37db9
Reviewed-on: https://review.whamcloud.com/48438
Reviewed-by: Qian Yingjin <qian@ddn.com>

EX-5014 pcc: avoid deadlock during DIO open attach on rhel7

The Maloo testing fails with sanity-pcc/45 due to the following
deadlock on rhel7 kernel:

ll_fid_path_cop D ffff9a32db5eb180 0 10783 10782 0x00000080
Call Trace:
schedule_preempt_disabled+0x29/0x70
__mutex_lock_slowpath+0xc7/0x1d0
mutex_lock+0x1f/0x2f
lookup_slow+0x33/0xa7
link_path_walk+0x80f/0x8b0
path_openat+0xae/0x5a0
do_filp_open+0x4d/0xb0
do_sys_open+0x124/0x220
SyS_open+0x1e/0x20

dd D ffff9a32fb5b6300 0 10779 10755 0x00000080
Call Trace:
wait_for_completion+0xfd/0x140
call_usermodehelper_exec+0x179/0x1a0
call_usermodehelper+0x40/0x60
pcc_copy_data_dio+0x267/0x340 [lustre]
pcc_attach_data_archive+0x6ff/0xe80 [lustre]
pcc_readonly_attach+0x3d2/0xad0 [lustre]
pcc_readonly_attach_sync+0x205/0x260 [lustre]
pcc_file_open+0x798/0xdd0 [lustre]
ll_atomic_open+0xd80/0x1780 [lustre]
do_last+0xa53/0x1340
path_openat+0xcd/0x5a0
do_filp_open+0x4d/0xb0
do_sys_open+0x124/0x220
SyS_open+0x1e/0x20

This bug only happens on the el7 kernel, which uses a mutex for inode
locking.
During ->ll_atomic_open(), the kernel takes this mutex on the
parent inode. However, when copying data via the user space helper
program ll_fid_path_copy, the helper also tries to obtain this mutex
on the parent inode during lookup, resulting in deadlock.
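
The shape of the deadlock can be reproduced in a small Python model.
This is an illustration only: a threading.Lock stands in for the
parent inode mutex, a thread stands in for the helper program, and a
timeout replaces the indefinite wait so that the sketch terminates.
helper_can_lock() is a hypothetical name.

```python
import threading


def helper_can_lock(parent_lock: threading.Lock,
                    timeout_s: float = 0.1) -> bool:
    """Run a 'helper' that tries to take parent_lock, as a lookup would."""
    result = []

    def helper():
        got = parent_lock.acquire(timeout=timeout_s)
        if got:
            parent_lock.release()
        result.append(got)

    t = threading.Thread(target=helper)
    t.start()
    t.join()  # the opener waits for the helper to finish
    return result[0]
```

If the caller already holds parent_lock while it waits, the helper
can never take the lock and the function returns False; without the
timeout both sides would wait forever.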

Test-Parameters: clientdistro=el7.9 testlist=sanity-pcc
Test-Parameters: clientdistro=el8.5 mdscount=2 mdtcount=4 testlist=sanity-pcc env=ONLY=45,ONLY_REPEAT=10
Change-Id: I384c7b1979d93183b86bbde311d29a50346a8d56
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/48405
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>

EX-5014 pcc: Add dio support for data copy during attach

PCC attach performance is bottlenecked by single threaded
buffered I/O performance.  We could do multi-threading, but
multi-threaded buffered I/O to one file has a very low
performance ceiling.  In order to significantly speed up
PCC attach performance, we need to switch to DIO.

DIO cannot be done from kernel memory due to various
restrictions, so we call out to a usermode helper.

Note that the helper uses open by fid because given a
file pointer, it's not possible to reliably generate the
path to a file on Lustre due to container namespace issues.
Specifically, the path used by the user may not work for
our helper program due to namespace differences.  So we
must use open by fid for the Lustre side of the copy.

This patch improves attach performance from about 1 GiB/s
to about 5 GiB/s.  This performance figure includes time to
read the data from Lustre *and* to write it out to PCC.

Temporarily disable sanity-pcc/45 until the deadlock problem is
fixed.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idb2a12296c3e4778763c9b576bbb0ecd2570a458
Reviewed-on: https://review.whamcloud.com/47158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5014 pcc: Readability cleanups

It's really hard to remember what 'inode' and 'file' mean
when there's more than one in a function, so I've redone
some of the names here.

Test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia00d0bda216a26f285f0fda8bc8edd3c51d66ce4
Reviewed-on: https://review.whamcloud.com/47157
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

RM-620 build: New tag 2.14.0-ddn59

New tag 2.14.0-ddn59

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I36134a0a38fd0f6778bfdf533d0eac2b0f662121

LU-15811 llite: Refactor DIO/AIO free code

Refactor the DIO/AIO free code and add some asserts.

This removes a potential use-after-free in the freeing
code.

Lustre-change: https://review.whamcloud.com/48115/
Lustre-commit: 0358bd41174176cbfc9d6786bffb6dc95b68adcf (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I335b18fc7a28fc426a25675e2449d3d192cba596
Reviewed-on: https://review.whamcloud.com/48103
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>

LU-15811 llite: Unify range unlock

Correct parallel_dio condition and unify range unlock code
block.

Lustre-change: https://review.whamcloud.com/48000/
Lustre-commit: 84064c8e8112aed2e49d2dcd6b4f1c6a21770261 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib66e8def571054df5117c279e238894bc3b58bce
Reviewed-on: https://review.whamcloud.com/47999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15811 llite: clarify 'nofree' usage

The 'nofree' value is confusing, and was backwards in master.
It's correct here in ES6, but this patch clarifies status a bit.
(No master equivalent, this was rolled in to:
https://review.whamcloud.com/47187 on master)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2dbd0c68250da17e982f04a566a5d77bd56796ef
Reviewed-on: https://review.whamcloud.com/47954
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

RM-620 build: New tag 2.14.0-ddn58

New tag 2.14.0-ddn58

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ica9b3ae537093d173e06790fea1b2c664e842a57

LU-13991 ldlm: speedup flock reprocess

We only need to check for deadlock against the first
conflicting lock; the remaining deadlock checks will
be performed after cancellation of the first
conflicting lock.

Lustre-change: https://review.whamcloud.com/40048
Lustre-commit: dadec10251090ba88c1b39517943e6603ba6d682

Change-Id: I18359db405ab021a4f32ac833de203254097142d
HPE-bug-id: LUS-8509
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/48320
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15402 ldlm: speedup RD flock enqueue

Scanning of lr_granted can stop once a
covering granted RD lock is reached.

Lustre-change: https://review.whamcloud.com/45957
Lustre-commit: b07a57027ee5cc1afa82cc4c82be73a2c4894502

Change-Id: I907cff002d9765c5f8496d377eddd5e62795d89c
HPE-bug-id: LUS-10623
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/48323
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13929 lnet: modify assertion in lnet_post_send_locked

Check that the pointer to the local interface is not NULL
before asserting. While checking if local ni is the destination,
the assertion may attempt to dereference pointer to local
interface after it has already been cleaned up on shutdown.

Lustre-change: https://review.whamcloud.com/40749
Lustre-commit: e5a8f3fc12840aee97fca03d76b1ae9b4572acb8

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I0f4be04a728a7243823bec70f9efbe52bcb104b3
Reviewed-on: https://review.whamcloud.com/48265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15446 lnet: Don't use pref NI for reserved portal

Don't use the preferred NI when sending traffic on the LNet reserved
portal. This allows local recovery pings to utilize any local NI as
source in the case where we do not have a multi-rail peer entry for
the local host. This is typically the case when MR is not being
configured statically (i.e. when discovery is being used for MR
configuration).

lnet_get_best_ni() was modified to include health values of the NIs
being compared in its debug output.

Lustre-change: https://review.whamcloud.com/46078
Lustre-commit: a2815441381cb6cee8eb9865d9279541ea04828e

HPE-bug-id: LUS-10658
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I38f5760bf034f698b7f44ffa89aa91c4f5d4b9ea
Reviewed-on: https://review.whamcloud.com/48312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14661 lnet: Check if discovery toggled off in ping reply

If a peer is initially discovered and found to have discovery
enabled, but the peer later reloads LNet with discovery disabled,
then we can delete the peer and re-create it the next time the peer
is discovered.

It is safe to delete and re-create the peer as long as it wasn't
configured manually.

In lnet_peer_deletion(), we need to use lnet_del_init() when removing
the peer from the discovery queue because the lnet_peer_del() code
path can result in a call to lnet_peer_queue_for_discovery() where
we check if the lp_dc_list is empty.

Lustre-change: https://review.whamcloud.com/43508
Lustre-commit: 143893381d428466d4c71e075a041a9cbbd28818

Test-Parameters: trivial
HPE-bug-id: LUS-9178
Fixes: aa7de0af69 ("LU-13895 lnet: Prevent discovery on peer marked deletion")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0b43d7541711a3b94c492082d4a29487ebe72b09
Reviewed-on: https://review.whamcloud.com/48296
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15512 lnet: Stop discovery on deleted peer NI

lnet_discover_peer_locked() needs to check whether the peer NI that is
undergoing discovery has been deleted (i.e. its associated peer has
LNET_PEER_MARK_DELETED state). Otherwise, we may enter an infinite
loop because this peer will never be considered up to date.
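
The loop exit described above can be sketched as follows. This is a
Python model, not lnet_discover_peer_locked() itself: Peer,
discover() and the run_discovery_round callable are hypothetical, and
max_rounds exists only to bound the sketch.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    up_to_date: bool = False
    marked_deleted: bool = False


def discover(peer: Peer, run_discovery_round,
             max_rounds: int = 1000) -> bool:
    """Repeat discovery until the peer is up to date.

    Returns False as soon as the peer is found to be marked for
    deletion, because such a peer will never become up to date.
    """
    for _ in range(max_rounds):
        if peer.marked_deleted:
            return False
        if peer.up_to_date:
            return True
        run_discovery_round(peer)
    raise RuntimeError("discovery did not converge")
```

Without the marked_deleted check, a deleted peer would keep the loop
running until the artificial bound is hit.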

Lustre-change: https://review.whamcloud.com/46429
Lustre-commit: 94f4e1f517d71ffd6662fb4a82e3dee9aa8f6796

Test-Parameters: trivial testlist=sanity-lnet
Fixes: fd32cd817c ("LU-13895 lnet: Prevent discovery on deleted peer")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I43d276fc460241c1724c8e30913bb6c5cbb7c8f4
Reviewed-on: https://review.whamcloud.com/48295
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13883 lnet: Lookup lpni after discovery

The lpni for a nid can change as part of the discovery process (see
lnet_peer_add_nid()). As such, callers of lnet_discover_peer_locked()
need to look up the lpni again after discovery completes to make sure
they get the correct peer.

An exception is lnet_check_routers() which doesn't do anything with
the peer or peer NI after the call to lnet_discover_peer_locked().
If the router list is changed then lnet_check_routers() will already
repeat discovery.

Lustre-change: https://review.whamcloud.com/39747
Lustre-commit: 584d9e46053234d02a3290822317552785e44e76

HPE-bug-id: LUS-9167
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I8bdfcb957e87f65ce65bfad81858a4ce3362298e
Reviewed-on: https://review.whamcloud.com/48294
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13894 lnet: Transfer disc src NID when merging peers

If we're merging two peers in lnet_peer_data_present() then we need
to transfer the src NID stored in the peer whose ping buffer we are
processing to the peer that actually owns the NIDs in the ping
buffer. Otherwise it is possible that the subsequent push to the peer
that is being discovered will go out over an interface that the peer
does not know about and it will be dropped.

Lustre-change: https://review.whamcloud.com/39607
Lustre-commit: e65d8ba583858ae10f2d53fd270b19d13e423634

Test-Parameters: trivial
HPE-bug-id: LUS-9193
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I050c7c1c2c0eddb8d5ff12f40342a8a02efacb9c
Reviewed-on: https://review.whamcloud.com/48293
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13895 lnet: Prevent discovery on deleted peer

We needn't perform any discovery activities on a peer that has had
lnet_peer_del() called on it.

Lustre-change: https://review.whamcloud.com/39605
Lustre-commit: fd32cd817cba336c684fe3ab7aac79705061e8b5

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-9192
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I5c89dc89038d2c8bf4d2a29029af7720963b81a2
Reviewed-on: https://review.whamcloud.com/48292
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13895 lnet: Prevent discovery on peer marked deletion

If a peer has been marked for deletion then we needn't perform any
other discovery operation on it. Integrate this peer state into the
top level of the discovery state machine so that it is checked before
any other state.

Lustre-change: https://review.whamcloud.com/39604
Lustre-commit: aa7de0af6969df77a896e3a2e90c971a5081e324

HPE-bug-id: LUS-9192
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie9de5b0d38d720f4f49d7e4a0673a6b52f9d3d80
Reviewed-on: https://review.whamcloud.com/48291
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14939 lnet: Allow specifying a source NID for lnetctl ping

Add a new --source option for lnetctl ping command. This allows the
user to specify a local NI from which to send the ping. This also
ensures that the specified destination NID is also used. Otherwise,
pings to multi-rail peers may end up going to a different peer NI
based on the multi-rail selection algorithm. The ability to specify
a source NI, and thus fix the destination NI, is a great help in
troubleshooting communication issues between multi-rail peers.

Add test to exercise lnetctl ping --source option.

Lustre-change: https://review.whamcloud.com/44727
Lustre-commit: 48ef9982c474a02c460293bce17c9e45f9829eab

HPE-bug-id: LUS-10296
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I454217b30a92414de537880f076a11a693b1f0b3
Reviewed-on: https://review.whamcloud.com/48297
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-4697 lipe: Define statistics fields for lpurge / lamigo

Added JSON stats output in lamigo
Extended JSON stats output in lpurge

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: Ib367022dd073c1699d75e3ea7cfa3b586e7b8877
Reviewed-on: https://review.whamcloud.com/48125
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5505 lipe: JSON statistics crashes lpurge

Call json_object_get() before json_object_put();
otherwise the json_object_put() call causes a crash.
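
The reference counting behind this fix can be modelled as follows.
This is a toy Python model of get/put semantics, not json-c and not
the lpurge code: RefCounted is a hypothetical class whose get() and
put() mirror what json_object_get() and json_object_put() do to the
reference count.

```python
class RefCounted:
    """Toy model of a reference-counted object."""

    def __init__(self):
        self.refs = 1        # the creator holds one reference
        self.freed = False

    def get(self):
        """Take an extra reference."""
        assert not self.freed, "use after free"
        self.refs += 1
        return self

    def put(self):
        """Drop one reference; free the object when none remain."""
        assert not self.freed, "double free"
        self.refs -= 1
        if self.refs == 0:
            self.freed = True
        return self.freed
```

A borrower that calls put() without a matching get() drops the
owner's only reference, so the owner's later use or put() of the
object hits freed memory.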

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: Id5e8f05dd010f6626835176bf854344cd2b58a93
Reviewed-on: https://review.whamcloud.com/47885
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13813 tests: fix stack_trap in conf-sanity test 110/111

This patch fixes stack_trap in conf-sanity test 110 and 111
to restore test environment.

Lustre-change: https://review.whamcloud.com/48022
Lustre-commit: 0109cee2610b8dfeaaca25c3eb1e805e033c593d

Test-Parameters: trivial env=SLOW=yes,ENABLE_QUOTA=yes clientdistro=el8.5 serverdistro=el8.5 testlist=conf-sanity
Test-Parameters: env=SLOW=yes,ENABLE_QUOTA=yes fstype=zfs clientdistro=el8.5 serverdistro=el8.5 testlist=conf-sanity
Test-Parameters: env=SLOW=yes,ENABLE_QUOTA=yes mdscount=2 mdtcount=4 clientdistro=el8.5 serverdistro=el8.5 testlist=conf-sanity
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I540d96e8ad2c4990e7da18fe22256b44e9a19c72
Reviewed-on: https://review.whamcloud.com/48023
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15018 o2iblnd: treat cmid->device == NULL as an error

Even if rdma_bind_addr is successful, kiblnd_dev_failover should
treat cmid->device == NULL as an error in order to later avoid
calling kiblnd_set_ni_fatal_on with possibly dev->ibd_hdev == NULL.

Lustre-change: https://review.whamcloud.com/44981
Lustre-commit: abd0ce62e96523193bfc2e2a3f574bc59d6c9f7c

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 4668283cd1 ("LU-14806 o2iblnd: clear fatal error on successful failover")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iefbe030b25d2dc543461cf98afeacd734fd64cf8
Reviewed-on: https://review.whamcloud.com/48258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5014 pcc: Limit attach queue depth

The existing async attach code does not attempt to limit
the number of async attaches that can be requested at once.
This is a problem because we could theoretically create too
many kthreads and overwhelm the system.

When the attach queue depth is exceeded, we stop allowing
new items to be queued by switching over to sync attach.

Ideally we would rebuild the attach code to generate a
queue of attach requests and have the attach thread code
pull items from the queue until it's exhausted, but that's
a much more substantial change and is left for later.

NB: This patch is incomplete - there's no way to adjust the
queue depth at runtime and there's no test for it. Both
need to be added.
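
The fallback described above (switch to sync attach once the queue depth is exceeded) can be sketched in user space as follows. This is only a toy model, not the PCC kernel code: the class name AttachQueue, the max_depth parameter and the use of Python threads in place of kthreads are all assumptions made for illustration.

```python
import threading


class AttachQueue:
    """Toy model: bounded async attach with a synchronous fallback."""

    def __init__(self, max_depth):
        self.max_depth = max_depth      # hypothetical fixed limit
        self._inflight = 0              # async attaches currently running
        self._lock = threading.Lock()
        self._threads = []

    def attach(self, work):
        """Run work() in a new thread if a slot is free, else inline.

        Returns "async" or "sync" to show which path was taken.
        """
        with self._lock:
            use_async = self._inflight < self.max_depth
            if use_async:
                self._inflight += 1
        if not use_async:
            work()                      # depth exceeded: caller attaches itself
            return "sync"
        thread = threading.Thread(target=self._run, args=(work,))
        with self._lock:
            self._threads.append(thread)
        thread.start()
        return "async"

    def _run(self, work):
        try:
            work()
        finally:
            with self._lock:
                self._inflight -= 1     # free the slot for later requests

    def wait(self):
        """Wait for all async attaches started so far."""
        with self._lock:
            threads = list(self._threads)
        for thread in threads:
            thread.join()


def demo():
    gate = threading.Event()            # keeps async workers busy
    done = []
    queue = AttachQueue(max_depth=2)
    modes = [queue.attach(lambda: (gate.wait(), done.append("a")))
             for _ in range(2)]
    modes.append(queue.attach(lambda: done.append("s")))  # third request
    gate.set()
    queue.wait()
    return modes, sorted(done)


print(demo())
```

The third request finds both slots busy and is executed by the caller, so the number of worker threads never exceeds the configured depth.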

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib00dfb67f5245a28b722278d031ee8cdf5e190d6
Reviewed-on: https://review.whamcloud.com/47061
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5014 pcc: Change PCC commands to use constants

PCC command names are just written out as strings, making
them hard to track. Change them all to use named commands.

This also includes a few minor debug and structural changes
as part of prep for the main patch against EX-5014.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icad8dfdb44ed2562a95b2aaa0432cba221e4a1bc
Reviewed-on: https://review.whamcloud.com/46894
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15874 kernel: new kernel [RHEL 9.0 5.14.0-70.22.1.el9_0]

This patch makes changes to support new RHEL 9.0 release
for Lustre client.

fix lbuild to include modified find-requires.ksyms

Lustre-change: https://review.whamcloud.com/47847
Lustre-commit: bbe5e9818053e43ebf97e2d3fa240917bfbd8336

Test-Parameters: trivial clientdistro=el9.0 \
env=SANITY_EXCEPT="101j 130 244a" testlist=sanity

Test-Parameters: trivial clientdistro=el9.0 \
env=LIPE_FIND_VERBOSE=true testlist=sanity-lipe

Test-Parameters: clientdistro=el9.0 testlist=sanity-pcc
Test-Parameters: clientdistro=el8.6 testlist=sanity-pcc

Change-Id: Ib7fdf9d3946df626759d395b5000b375391da344
Co-Authored-By: Minh Diep <mdiep@whamcloud.com>
Co-Authored-By: Alex Deiter <alex.deiter@gmail.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/47880
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]

This patch makes changes to support new SLES15 SP4 release
with kernel 5.14.21-150400.24.18.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/47696
Lustre-commit: TBD (from 4bf090b81119d02ed5baa59c6857e2c88a746736)

Test-Parameters: trivial clientdistro=sles15sp4 \
env=SANITY_EXCEPT="27J 101j 244a" testlist=sanity
Test-Parameters: trivial clientdistro=sles15sp3

Change-Id: I0bf548835578163767d2f6a2a5e5bd2b33154871
Co-Authored-By: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47905
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16060 osd-ldiskfs: copy nul byte terminator in writelink

The memcpy() call in osd_ldiskfs_writelink() doesn't copy the nul
terminator byte from the source buffer, leaving the space
after the target link name uninitialized. This is OK for the kernel
code and debugfs, but not for e2fsck.
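
The effect of copying one byte too few can be modelled with a byte buffer. This is a sketch under assumptions: write_link() and read_c_string() are hypothetical helpers, the 0xaa fill stands in for stale buffer contents, and the reader that trusts the terminator only illustrates why a consumer might be stricter than one that uses a stored length.

```python
def write_link(buf, name, copy_nul):
    """Copy name into buf the way memcpy(buf, src, n) would."""
    src = name + b"\0"                       # C string: bytes plus terminator
    n = len(name) + (1 if copy_nul else 0)   # the fix copies one more byte
    buf[:n] = src[:n]
    return buf


def read_c_string(buf):
    """Read up to the first nul byte, as a terminator-based reader would."""
    return bytes(buf).split(b"\0", 1)[0]


stale = b"\xaa" * 8                          # stand-in for leftover data

without_nul = write_link(bytearray(stale), b"abc", copy_nul=False)
with_nul = write_link(bytearray(stale), b"abc", copy_nul=True)

print(read_c_string(without_nul))            # name followed by stale bytes
print(read_c_string(with_nul))               # just the name
```

A reader that knows the name length sees the same three bytes in both cases, which matches the observation that only some consumers are affected.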

Lustre-change: https://review.whamcloud.com/48092
Lustre-commit: 907dc0a2d333f2df2d654a968fc50f8cc05b779d

Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
HPE-bug-id: LUS-11103
Change-Id: I914f2c78e1a6571bf360a23b0ede8c70502bf0df
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-16037 build: remove quotes from %{mkconf_options}

This patch fixes lustre-dkms.spec.in to remove quotes
from %{mkconf_options} passed to dkms.mkconf, so as to
resolve the following build issue:

dkms.conf: Error! Directive 'DEST_MODULE_LOCATION'
does not begin with '/kernel', '/updates', or '/extra'
in record #0.

Lustre-change: https://review.whamcloud.com/48044
Lustre-commit: 33efefc496159f7d0caed0fa85d8f92603060ae2

Test-Parameters: trivial

Change-Id: I0b365d7a96cb632680bc2321e87b28a3bf076e47
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48118
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-10378 utils: add formatted printf to lfs find

Introduce new --printf option with lfs find utility along with
support for the backslash escapes and format directives given
below that allow users to obtain metadata in formatted style.

List of backslash escapes supported by --printf option:
-------------------------------------
   Description               | Escape
-------------------------------------
   Newline character         | \n
   Tab character             | \t
   Literal backslash         | \\

List of format directives used with --printf option:
----------------------------------------------------------
   Description                                  | Directive
----------------------------------------------------------
   Literal % character                          | %%
   Access time (in ctime format)                | %a
   Access time (in secs since epoch)            | %A@
   File size (in 512B blocks)                   | %b
   Last change time (in ctime format)           | %c
   Last change time (in secs since epoch)       | %C@
   Numeric group ID of file/dir owner           | %G
   File size (in 1K blocks)                     | %k
   File mode (octal)                            | %m
   Path name of file                            | %p
   File size (in bytes)                         | %s
   Modification time (in ctime format)          | %t
   Modification time (in secs since epoch)      | %T@
   Numeric user ID of file/dir owner            | %U
   Birth time (in ctime format)                 | %w
   Birth time (in secs since epoch)             | %W@
   File type                                    | %y
   Stripe count                                 | %Lc
   Lustre FID                                   | %LF
   Directory hash type                          | %Lh
   Starting OST (file) or MDT (dir) index       | %Li
   List of all OST (file) or MDT (dir) indices  | %Lo
   OST pool name                                | %Lp
   Numeric project id assigned to file/dir      | %LP
   Stripe size in bytes                         | %LS
---------------------------------------------------------
Note: Stripe size and OST pool name are not defined for
directories whereas Hash type is not defined for files.
%Li gives starting OST index for files and starting MDT index
for directories. For composite files %Lo provides list of all
OST indices for all components whereas %Lc, %LS, %Li and %Lp
provide details for last initialized component only.

A usage example for --printf option and its output for a composite
file with three components are shown below.

   lfs find --printf '%a | %t | %c | %w | %W@ | %b | %s | %U | %G |
   %A@ | %T@ | %C@ | %LP | %Lc | %LS | %Li | %Lo | %Lp | %p\n'
   /lustre/lustre/composite.txt

   Tue Oct 26 16:06:18 2021 | Tue Oct 26 16:06:50 2021 | Tue Oct 26
   16:06:50 2021 | Tue Oct 26 16:06:18 2021 | 1635278778 | 204800 |
   104857600 | 0 | 0 | 1635278778 | 1635278810 | 1635278810 | 0 | 3 |
   2097152 | 2 | [1][2,0][2,0,1] | pool1 |
   /lustre/lustre/composite.txt
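
How a format string with backslash escapes and % directives expands can be sketched with a small formatter. This is not the lfs implementation: format_entry() is a hypothetical helper, it handles only the escapes and the non-Lustre directives %%, %p, %s, %m, %U and %G, and it takes its values from os.stat().

```python
import os
import re
import tempfile

ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}


def format_entry(fmt, path):
    """Expand a subset of the --printf escapes and directives for path."""
    st = os.stat(path)
    directives = {
        "%": "%",                                  # literal percent
        "p": path,                                 # path name
        "s": str(st.st_size),                      # size in bytes
        "m": format(st.st_mode & 0o7777, "o"),     # permission bits, octal
        "U": str(st.st_uid),                       # numeric user ID
        "G": str(st.st_gid),                       # numeric group ID
    }

    def expand(match):
        kind, char = match.group(1), match.group(2)
        table = ESCAPES if kind == "\\" else directives
        # Sequences this sketch does not know are left untouched.
        return table.get(char, match.group(0))

    return re.sub(r"([\\%])(.)", expand, fmt, flags=re.S)


with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    print(format_entry(r"%p | %s | %m | %U | %G\n", tmp.name), end="")
```

A single pass over the string keeps the expansion unambiguous: each backslash or percent sign consumes exactly one following character.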

Lustre-change: https://review.whamcloud.com/45136
Lustre-commit: 6b8e97b76c472068e7d6bc792e4f202b2f70ca67

Change-Id: I370c0978900a4837b0ea3060e08dabb1fcb6e115
Signed-off-by: Anjus George <georgea@ornl.gov>
Signed-off-by: Rick Mohr <mohrrf@ornl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/48252
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>

LU-14179 lfs: avoid lfs find error with long paths

Test that files created in a directory having an absolute path length
of up to PATH_MAX-1 are properly found with lfs find. This change
might not cover other very deep directory trees (above PATH_MAX).

Lustre-change: https://review.whamcloud.com/41337
Lustre-commit: a6a76df19db61a2015f4cc78f88060f249c955f2

Signed-off-by: Stephane Thiell <sthiell@stanford.edu>
Change-Id: I44726efd5053c593094587e5c8a4652a3a876641
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48272
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-5170 utils: add lfs df -H for decimal units

Running "lfs df -ih" prints a base-two suffix for inode counts,
which is somewhat unintuitive (e.g. 100000 becomes 97.2K inodes).
While this is consistent with upstream "df", upstream "df" also has
a "-H" option to print the output with decimal suffixes.

Add the -H/--si option to "lfs df" also.

Document the 'f' (flash) and 'N' (noprecreate) flags for "lfs df".
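
The difference between base-two and decimal suffixes can be shown with a short sketch. The humanize() helper is hypothetical and its one-decimal rounding is an assumption; the exact rounding used by "lfs df" may differ.

```python
def humanize(count, base):
    """Scale count by base (1024 for -h, 1000 for -H) and add a suffix."""
    value = float(count)
    for suffix in ("", "K", "M", "G", "T", "P"):
        if value < base or suffix == "P":
            break                       # value now fits under this suffix
        value /= base
    return f"{value:.1f}{suffix}" if suffix else str(count)


for inodes in (999, 1000000, 5000000000):
    print(inodes, humanize(inodes, 1024), humanize(inodes, 1000))
```

With a base of 1000, one million inodes is shown as an even 1.0M, while a base of 1024 gives a value just under one thousand K.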

Lustre-change: https://review.whamcloud.com/41271
Lustre-commit: 7b720df1fbd4136cd1ab8f3fefefd3971b2f7031

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I06b8df4ae2940107720e57013bf187b3473ebbe5
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-15548 tests: skip conf-sanity/131 for older servers

Skip conf-sanity.sh test_131 when running against older servers that
do not support the trusted.projid xattr.

Lustre-change: https://review.whamcloud.com/48151
Lustre-commit: 5fefbd10786f3f8705d2251071d8778d0de2835d

Test-Parameters: trivial testlist=conf-sanity env=ONLY=131
Test-Parameters: testlist=conf-sanity env=ONLY=131 serverversion=2.14.0
Fixes: e4d07f2c30 ("LU-12056 ldiskfs: add trusted.projid virtual xattr")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1858502ab50ffd10e494eab793e3bc0f883fe9e
Reviewed-on: https://review.whamcloud.com/48264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-15076 socklnd: lock ksnc_tx_queue list processing

A GPF (general protection fault) occurred in
ksocknal_find_timed_out_conn() while processing the
ksnc_tx_queue list.

Add locking to this list.

Lustre-change: https://review.whamcloud.com/45179
Lustre-commit: 13c7c2e3c248c8cdba4853852bfaecceb7a75afe

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I1f76683e5798c5015f11e3fa285db9613b1af906
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
HPE-bug-id: LUS-10248
Fixes: 25c1cb2c4d ("LU-9120 lnet: handle socklnd tx failure")
Reviewed-on: https://review.whamcloud.com/48256
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

EX-5644 utils: fix lpcc service dependency on lipe-lpcc

This patch makes changes to support python3 for lipe:
* The long() function is no longer supported by Python 3.
  Python 3 has only one built-in integral type, int.
* Octal literals are no longer of the form 0720 - use
  0o720 instead.
* The print statement has been replaced with a print()
  function, with keyword arguments to replace most of the
  special syntax of the old print statement (PEP 3105).
* The dict.iterkeys(), dict.iteritems() and dict.itervalues()
  methods are no longer supported - use dict.keys(), dict.items()
  and dict.values() instead.
* The StringIO and cStringIO modules are gone. Instead,
  import the io module and use io.StringIO or io.BytesIO
  for text and data respectively.
* The builtin basestring abstract type was removed.
  Use str instead.
* Use string.ascii_lowercase instead of string.lowercase.
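
The replacements listed above can be seen together in a short Python 3 sketch. The summarize() function and the sample data are made up for illustration and are not taken from the lipe sources.

```python
import io
import string

DEFAULT_MODE = 0o720                        # replaces the 0720 literal
BIG = int("9" * 30)                         # int is unbounded; long() is gone
LETTERS = string.ascii_lowercase[:3]        # replaces string.lowercase


def summarize(perms):
    """Render a {name: mode} mapping using only Python 3 idioms."""
    out = io.StringIO()                     # replaces StringIO.StringIO
    for name, mode in sorted(perms.items()):    # replaces dict.iteritems()
        if isinstance(name, str):           # replaces basestring
            # print() is a function; sep= and file= replace the old syntax
            print(name, format(mode, "o"), sep="=", file=out)
    return out.getvalue()


print(summarize({"b": 0o644, "a": DEFAULT_MODE}), end="")
print(LETTERS, BIG.bit_length())
```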

Test-Parameters: trivial testlist=sanity-lipe env=SANITY_EXCEPT="101j 130 244a"
Test-Parameters: clientdistro=el7.9 testlist=sanity-pcc
Test-Parameters: clientdistro=el8.6 testlist=sanity-pcc
Change-Id: Ia5a3b6490fd4cebbd40327d5b2a431590c82cf00
Signed-off-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/48149
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-16056 libcfs: restore umask handling in kernel threads

This reverts commit 9013eb2bb5 which incorrectly assumes that Lustre
service threads do not modify umask. A quick grep shows that umask
is modified in osd-ldiskfs __osd_create().

If some other thread sharing the same fs context is modifying umask
in an incompatible way (which includes all Lustre threads after
this patch) then it will occasionally break created file access
permissions for Lustre.

Lustre-change: https://review.whamcloud.com/48233
Lustre-commit: TBD (from e88334d806687ad2512323f1e4c2667348f02a4e)

Fixes: 9013eb2bb5 ("LU-9859 libcfs: don't call unshare_fs_struct()")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I589b72e4286dc84f4e3f1a0c54fe31aa988e6c18
Reviewed-on: https://review.whamcloud.com/48236
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-16019 llite: fully disable readahead in kernel I/O path

In newer kernels (RHEL 9 or Ubuntu 22.04), the readahead path may
be outside the control of the Lustre CLIO engine:

generic_file_read_iter()
  ->filemap_read()
    ->filemap_get_pages()
      ->page_cache_sync_readahead()
        ->page_cache_sync_ra()

void page_cache_sync_ra()
{
    if (!ractl->ra->ra_pages || blk_cgroup_congested()) {
        if (!ractl->file)
            return;
        req_count = 1;
        do_forced_ra = true;
    }

    /* be dumb */
    if (do_forced_ra) {
        force_page_cache_ra(ractl, req_count);
        return;
    }
    ...
}

From the kernel readahead code, even if read-ahead is disabled
(via @ra_pages == 0), it still issues this request as read-ahead
as we will need it to satisfy the requested range. The forced
read-ahead will do the right thing and limit the read to just
the requested range, which we will set to 1 page for this case.

Thus setting @ra_pages to 0 alone cannot fully avoid read-ahead in
the kernel I/O path.
To fully disable read-ahead in the Linux kernel I/O path, we also
need to set @io_pages to 0, which sets the I/O range to 0 in
@force_page_cache_ra():
void force_page_cache_ra()
{
    ...
    max_pages = max_t(unsigned long, bdi->io_pages,
                      ra->ra_pages);
    nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
    while (nr_to_read) {
        ...
    }
    ...
}

After setting bdi->io_pages to 0, sanity test 101j passes.
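
The arithmetic in the force_page_cache_ra() excerpt can be checked with a small model. This is a simplification of the quoted lines only; forced_ra_pages() is a hypothetical name, and the value 256 for io_pages is an arbitrary example.

```python
def forced_ra_pages(nr_to_read, ra_pages, io_pages):
    """Pages the forced read-ahead path would read (simplified model)."""
    max_pages = max(io_pages, ra_pages)     # max_t() in the excerpt
    return min(nr_to_read, max_pages)       # min_t() in the excerpt


# With ra_pages == 0 the sync path forces a one-page request (req_count = 1).
for io_pages in (256, 0):
    pages = forced_ra_pages(nr_to_read=1, ra_pages=0, io_pages=io_pages)
    print(f"ra_pages=0 io_pages={io_pages} -> {pages} page(s) read")
```

Only when both values are 0 does max_pages drop to 0, so the while (nr_to_read) loop has nothing to do.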

Lustre-change: https://review.whamcloud.com/47993
Lustre-commit: f0cf7fd3cccb2313fa94a307cf862afba256b8d8

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I859a6404abb9116d9acfa03de91e61d3536d3554
Reviewed-on: https://review.whamcloud.com/48078
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15896 gss: support OpenSSLv3

Lustre GSS code makes use of some OpenSSL API that has been
deprecated in v3, namely all the functions in the DH_* family.
So replace them with their EVP_PKEY_* counterparts if Lustre is
built on a system with OpenSSLv3.

Lustre-change: https://review.whamcloud.com/47717
Lustre-commit: 615691a531a80b75c4dd054dbb86d0bdbf4cf808

Fixes: ee60c14360 ("LU-15896 gss: ignore OpenSSLv3 deprecated API")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I78a4ca18b25aca3c34fe84e41413a33caddc01b6
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48185
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-12678 o2iblnd: fix bug in list_first_entry() change.

This comparison should be != NULL, else a NULL pointer could be
dereferenced.

Lustre-change: https://review.whamcloud.com/43558
Lustre-commit: 0024460d797490ae90a2221cb5d4648c9d4fac82

Test-Parameters: trivial
Fixes: 34b57a6f8fcd ("LU-12678 lnet: use list_first_entry() in lnet/klnds subdirectory.")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4510e2e0f2eb7b5bf86626e5ddb5ee537d3fae02
Reviewed-on: https://review.whamcloud.com/48245
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>

LU-12678 lnet: use list_first_entry() in lnet/klnds subdir

Convert
list_entry(foo->next .....)
to
list_first_entry(foo, ....)

in 'lnet/klnds'.

In several cases the call is combined with a list_empty() test, and
list_first_entry_or_null() is used instead.

Lustre-change: https://review.whamcloud.com/43419
Lustre-commit: 34b57a6f8fcd1bc57c0ba92e299bd39f3baa6cb5

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Change-Id: I3b2b33c3c9284c02e44610614d64a1f84be300a4
Reviewed-on: https://review.whamcloud.com/48244
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15983 lnet: Define KFILND network type

Define the KFILND network type. This reserves the network type number
for future implementation and allows creation of kfi peers and
adding routes to kfi peers.

Lustre-commit: 5fea36c952373c9a235be7bf57eb2e516fcb36b2
Lustre-change: https://review.whamcloud.com/47830

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11060
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I9111645f1290c8af4937d1b2689a068df81922a4
Reviewed-on: https://review.whamcloud.com/48220
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15925 lnet: add debug messages for IB

If net debug is enabled, information about the connection is
collected when the tx status is ECONNABORTED (IB only).

Lustre-change: https://review.whamcloud.com/47583
Lustre-commit: 9153049bdc7ec8217691481df64551e2768455a9

Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I44a33703931630b85cc0e847e2a038217b7967c6
Reviewed-on: https://review.whamcloud.com/48042
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15393 lod: skip qos for qos_threshold_rr=100

The current implementation of QoS allocation is called for
every statfs update. It takes lq_rw_sem for write and
recalculates penalties, even with qos_threshold_rr=100,
which means always use round-robin allocation. Skip the unnecessary
locking and calculation for 100% round-robin allocation.

Lustre-change: https://review.whamcloud.com/46388
Lustre-commit: 2f23140d5c1396fd0b247bd7f9c249f6e24096b7

HPE-bug-id: LUS-10388
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I2fcc272d00a988ca4ba0f745b1d5809d65b28654
Reviewed-on: https://review.whamcloud.com/48206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15393 lod: use killable semaphore for creation path

The lod_ost_alloc_qos() function sleeps during OST failover, even
though object allocation could use different OSTs. The patch changes
the down_write() call to down_write_killable() and adds a timer for a
wakeup.

The main idea of this fix is as follows. When an OST is lost during
lod_ost_alloc_rr() and the MDT does not have precreated objects for it,
lod_ost_alloc_rr()->..->lod_qos_declare_object_on() would sleep while
holding lq_rw_sem for read. After a statfs update, any creation thread
would get stuck in lod_ost_alloc_qos() waiting for lq_rw_sem for write.
With the fix, the sleep is limited and allocation goes through
lod_ost_alloc_rr(). For read, lq_rw_sem is shared, and stripe
allocation skips OSTs without objects.

lod_ost_alloc_rr() refills the OST pool while holding lq_rw_sem for
write when lq_rr.lqr_flags has LQ_DIRTY set. This should happen only
when an OST is added or removed. There is no need to set LQ_DIRTY for
lq_rr when statfs gets an error, since this flag does not cause any
change to the pool list in lod_qos_calc_rr().

Change the behaviour of lod_check_and_reserve_ost() so that it sleeps
during object allocation for speed 2 only.

Lustre-change: https://review.whamcloud.com/45921
Lustre-commit: f46782b4c7dcaacd0046ebad3e3d84c2bb0367d4

HPE-bug-id: LUS-10388
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I4768c4cf7d2f9f02f0a9e0dfb6d15e02932cb5fe
Reviewed-on: https://review.whamcloud.com/48194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15880 quota: fix issues in reserving quota

Calling "chgrp" as an unprivileged user will reserve quota space
before changing the GID of the file, and the reserved quota space
will be freed after its transaction is committed. There are some
issues in the current implementation:
1. The reserved quota isn't freed in case of an error in "mdd_attr_set"
   and "tgt_cb_last_committed".
2. When freeing the reserved quota, the quota space to free is
   taken from the same parameter used for reserving the quota, which
   could be wrong. For instance, the reserved quota space will be 0 if
   the corresponding quota ID isn't enforced, but the call will
   return without error.

Like "qsd_op_begin/qsd_op_end", the patch also takes a reference on
the lquota_entry obtained while reserving quota and releases it when
freeing the reserved quota, to prevent potential issues.
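
The two points above (free on every path, and free the amount that was actually reserved) can be sketched as follows. This is a simplified model, not the lquota code: the Entry class and the reserve(), release() and change_group() helpers are hypothetical, and the release happens immediately rather than after a transaction commit.

```python
class Entry:
    """Stand-in for a quota entry with a reference count."""

    def __init__(self, enforced):
        self.enforced = enforced
        self.reserved = 0
        self.refs = 0


def reserve(entry, space):
    """Take a reference and reserve space; return what was really reserved."""
    entry.refs += 1
    granted = space if entry.enforced else 0    # unenforced ID: nothing held
    entry.reserved += granted
    return granted


def release(entry, granted):
    """Free exactly the granted amount and drop the reference."""
    entry.reserved -= granted
    entry.refs -= 1


def change_group(entry, space, do_change):
    granted = reserve(entry, space)
    try:
        do_change()
    finally:
        release(entry, granted)     # runs on success and on error


unenforced = Entry(enforced=False)
change_group(unenforced, 100, lambda: None)
print(unenforced.reserved, unenforced.refs)
```

Freeing the requested 100 instead of the granted 0 would leave the unenforced entry with a negative reservation, which is the kind of mismatch described in point 2.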

Lustre-change: https://review.whamcloud.com/47425
Lustre-commit: 40daa59ac41f450b60b42eb2bb0ff42ebd3c998b

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I098cde7d5e89fe8b9eaab0ae4bc285a4ac6c2281
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47944
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

RM-620 build: New tag 2.14.0-ddn57

New tag 2.14.0-ddn57

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If22ae9728b7bf32135e118e5c0d30d32a10216de

LU-15888 build: Debian dkms-debs requires ed and libkeyutils

dkms install/build needs dependencies on libmount-dev,
libkeyutils1, and libkeyutils-dev

Debian does not install the 'ed' package by default.
Without the 'ed' package the version is not correctly added
to the changelog and parsed to the package names.

Debian does not have linux-image or linux-headers pseudo
packages, so require the arch-specific ones, e.g.:
linux-image | linux-image-amd64 | linux-image-arm64
and:
linux-headers | linux-headers-amd64 | linux-headers-arm64
respectively.

o2ib fails to find Debian in-kernel Module.symvers and
should check $LINUX_OBJ/Module.symvers before failing.

Lustre-change: https://review.whamcloud.com/47455
Lustre-commit: 7dc6e1128a030c7e12eb23ea41935ed6fe77ce1f

HPE-bug-id: LUS-10984
Test-Parameters: trivial
Fixes: 85a6eebeca1 ("LU-15652 build: On Debian detect -common kernel headers")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I82e2689f3af4b9ce106ee3ab6b4109d2709c8872
Reviewed-on: https://review.whamcloud.com/47968
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15913 mdt: disable parallel rename for striped dirs

Parallel rename should not be done within striped directories to
avoid remote updates.  These are like cross-directory renames.

Add tunables for parallel directory rename in case of problems.
These can be configured separately for files and directories.

    mdt.*.enable_parallel_rename_dir
    mdt.*.enable_parallel_rename_file

Lustre-change: https://review.whamcloud.com/47593
Lustre-commit: f238540c879dc668e18cf99cba62f117ccae64d6

Fixes: 90979ab390 ("LU-12125 mds: allow parallel directory rename")
Fixes: d76cc65d5d ("LU-12125 mds: allow parallel regular file rename")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I384976cd1c9f401169336ee7a479ba0e3dd9f4ee
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48124
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-15994 tests: add testing for io_uring via fio

This patch adds test case for io_uring I/O engine via fio.

Lustre-change: https://review.whamcloud.com/48167
Lustre-commit: TBD(023160bfe79583f3d11d98d89df33f88fe6ffd12)

Test-Parameters: trivial testlist=sanity
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I0f2e371f91c02dc76644f42e5d1055ec200597c6
Reviewed-on: https://review.whamcloud.com/48168
Reviewed-by: Colin Faber <cfaber@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-14472 quota: skip non-exist or inact tgt for lfs_quota

Nonexistent or inactive targets (MDC or OSC) should be skipped
by "lfs quota".

Lustre-change: https://review.whamcloud.com/41771
Lustre-commit: b54b7ce43929ce7ff6e48cd219623c264ca6b6b3

Change-Id: I25eece413715e4e05dd94ccbfd101220da7477f9
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48171
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15994 llite: use fatal_signal_pending in range_lock

FIO io_uring failed with one file shared by two FIO processes
under the Ubuntu 22.04 kernel.
After analysis, we found that the range_lock() function returns
-ERESTARTSYS when there is a pending signal on the current process in
Lustre I/O. This causes -EINTR to be returned to the application.

The reason we have a pending signal is that io_uring uses
signal-based task_work in newer kernels. Thus when an
I/O process tries to acquire the range lock and checks
signal_pending(), it may always find a pending signal and return
-ERESTARTSYS.

We solve this bug by replacing @signal_pending(current) with
@fatal_signal_pending(current) in range_lock(). The range_lock()
function now only returns -ERESTARTSYS when the current process has
a fatal pending signal such as SIGKILL.
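
The behavioural difference between the two checks can be modelled in a few lines. This is a user-space sketch under assumptions: range_lock_status() is a hypothetical function, pending signals are represented as a set of names, and the task_work_notify flag stands in for the signal-based task_work notification that makes signal_pending() return true without a real signal being queued.

```python
ERESTARTSYS = 512       # kernel-internal errno value


def range_lock_status(pending, task_work_notify, fatal_only):
    """Return 0 to keep waiting for the lock, or -ERESTARTSYS to bail out."""
    if fatal_only:
        # models fatal_signal_pending(current)
        interrupted = "SIGKILL" in pending
    else:
        # models signal_pending(current)
        interrupted = bool(pending) or task_work_notify
    return -ERESTARTSYS if interrupted else 0


# io_uring task_work pending, no real signal queued
print(range_lock_status(set(), True, fatal_only=False))
print(range_lock_status(set(), True, fatal_only=True))
# a fatal signal still interrupts the wait
print(range_lock_status({"SIGKILL"}, False, fatal_only=True))
```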

Lustre-change: https://review.whamcloud.com/48106
Lustre-commit: TBD(91244566e9d2762c2f64a67e7d4fad8f301b556c)

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I0a0be8fa3b4ba5c89f7866286b2bdc6595f18026
Reviewed-on: https://review.whamcloud.com/48111
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

RM-620 build: New tag 2.14.0-ddn56

New tag 2.14.0-ddn56

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifc36f35eab332d699990371dabaf8ab5ca7bf44e

LU-15714 pcc: reserve layout intent flags for PCCRO

Reserve the following layout intent flags for PCCRO:
LAYOUT_INTENT_PCCRO_SET = 7, /** set read-only layout for PCC */
LAYOUT_INTENT_PCCRO_CLEAR = 8, /** clear read-only layout */

Lustre-change: https://review.whamcloud.com/46981
Lustre-commit: TBD (from 65374767dd4499f64cbc1c8182d333b1155e3272)

Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If8a414103ab13155aa483179247c81908b6ced69
Reviewed-on: https://review.whamcloud.com/47000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15850 lmv: always space-balance r-r directories

If the MDT free space is imbalanced, use QOS space balancing for
round-robin subdirectory creation, regardless of the depth
of the directory tree. Otherwise, new subdirectories created
in parents with round-robin default layout may suddenly become
"sticky" on the parent MDT and upset the space balancing and
load distribution.

Add sanity/test_413h to check that round-robin dirs always balance.

Lustre-change: https://review.whamcloud.com/47578
Lustre-commit: 37c1ddc34d3a1e61c5533f48cb29fe2258ca2907

Test-Parameters: testlist=sanity env=ONLY=413h,ONLY_REPEAT=100
Fixes: 38c4c538f5 ("LU-15216 lmv: improve MDT QOS space balance")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia1d0b5b1a027cf14236f93ae34b5cf4929e76d23
Reviewed-on: https://review.whamcloud.com/47871
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>