Whamcloud - gitweb
fs/lustre-release.git
9 months ago LU-16982 ldiskfs: Fix crash after "umount -d -f /mnt/..." 60/51760/5
Vitaliy Kuznetsov [Fri, 28 Jul 2023 15:40:14 +0000 (19:40 +0400)]
LU-16982 ldiskfs: Fix crash after "umount -d -f /mnt/..."

This patch adds an extra state check during the unmount process,
because the following problem was seen:
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
kernel BUG at fs/jbd2/transaction.c:378!
CPU: 0 PID: 310834 Comm: kworker/0:2 4.18.0-477.15...
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Workqueue: events flush_stashed_stats_work [ldiskfs]
RIP: 0010:start_this_handle+0x22c/0x520 [jbd2]
Call Trace:
 jbd2__journal_start+0xee/0x1f0 [jbd2]
 jbd2_journal_start+0x19/0x20 [jbd2]
 flush_stashed_stats_work+0x36/0x90 [ldiskfs]
 process_one_work+0x1a7/0x360
 worker_thread+0x30/0x390
 kthread+0x134/0x150
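
The patch itself is not reproduced on this page. As a rough, hedged
illustration of the general pattern (a deferred work item checking
filesystem state before starting a journal handle), here is a toy
Python model. The class, the `unmounting` flag and all method names
are hypothetical and are not the ldiskfs code:

```python
import threading


class MiniSuperblock:
    """Toy stand-in for a mounted filesystem with a journal (hypothetical)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.unmounting = False     # hypothetical state flag set by umount
        self.journal_alive = True
        self.flushes = 0

    def start_handle(self):
        # Starting a transaction on a torn-down journal is the analogue of
        # the BUG reported in the trace.
        if not self.journal_alive:
            raise RuntimeError("journal handle started after teardown")

    def flush_stats_work(self):
        """Deferred work item: skip quietly once unmount has begun."""
        with self.lock:
            if self.unmounting:
                return False
            self.start_handle()
            self.flushes += 1
            return True

    def umount(self):
        """Mark the filesystem as going away, then tear down the journal."""
        with self.lock:
            self.unmounting = True
            self.journal_alive = False
```

Without the `unmounting` check, a work item that runs after `umount()`
would reach `start_handle()` and fail, which mirrors the crash path
shown in the call trace.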

Fixes: e27a7b33d6 ("LU-16298 ldiskfs: Periodically write ldiskfs superblock")
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I162d3416ca1fe9bd09f1102ccca892db05719016
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51760
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-13805 tests: add unaligned io to multiop 90/49990/24
Patrick Farrell [Tue, 14 Feb 2023 18:29:09 +0000 (13:29 -0500)]
LU-13805 tests: add unaligned io to multiop

Add memory unaligned IO support to multiop.

This will be used by tests for unaligned DIO.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I38c049690610d34564a15e57f37c052105ab2066
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16954 llite: do not set SB_I_CGROUPWB on super block 01/51701/3
Li Dongyang [Wed, 26 Jul 2023 10:52:24 +0000 (20:52 +1000)]
LU-16954 llite: do not set SB_I_CGROUPWB on super block

On clients with a more recent kernel, e.g. ubuntu2204,
this makes the mount fail sometimes with
sysfs: cannot create duplicate filename '/devices/virtual/bdi/lustre-ffff8dd549f3d000'

Change-Id: Ie15e41eb9d039829545e1d69f97ed9e13f89e53e
Fixes: f5a75ea44d ("LU-16697 llite: Set BDI_CAP_* flags for lustre")
Test-Parameters: clientdistro=ubuntu2204 testlist=sanity,conf-sanity
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51701
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago New tag 2.15.57 2.15.57 v2_15_57
Oleg Drokin [Tue, 1 Aug 2023 06:17:39 +0000 (02:17 -0400)]
New tag 2.15.57

Change-Id: Ice12bbb65d4d455b2beea14e83a9ab663bda237c

9 months ago LU-16983 mdc: check errcode prior mdc_fill_lvb() call 61/51761/2
Mikhail Pershin [Tue, 25 Jul 2023 22:09:31 +0000 (01:09 +0300)]
LU-16983 mdc: check errcode prior mdc_fill_lvb() call

mdc_enqueue_fini() can be called with a negative
errcode parameter if request processing failed.
In that case mdc_fill_lvb() shouldn't be called.

The issue may occur with DoM files, an old server (<2.14) and
a new client. The problem is in the new client code.
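
The guard described here can be sketched in a few lines. This is a
simplified Python model under stated assumptions, not the mdc code:
`enqueue_fini`, `fill_lvb` and the dictionary layout are hypothetical
stand-ins for mdc_enqueue_fini(), mdc_fill_lvb() and the reply buffer.

```python
def fill_lvb(reply, lvb):
    """Copy size data out of a reply; only valid for a successful reply."""
    lvb["size"] = reply["size"]


def enqueue_fini(errcode, reply, lvb):
    """Finish an enqueue: return early on failure, before touching the reply."""
    if errcode < 0:
        # Request processing failed, so the reply may carry no usable data.
        return errcode
    fill_lvb(reply, lvb)
    return 0
```

With the early return in place, a failed request leaves `lvb`
untouched instead of reading fields from a reply that was never
filled in.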

Test-Parameters: testlist=racer serverversion=EXA5.2.8
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I884398beada4286bc07875247e15b41120f73a3e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51761
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16979 utils: enable throttling mirror extend 58/51758/2
Alex Zhuravlev [Tue, 25 Jul 2023 15:07:19 +0000 (18:07 +0300)]
LU-16979 utils: enable throttling mirror extend

this can be useful in some scenarios like massive mirror
creation.

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia84054f3519cd5cef37aaabb2ae605fb6ea200e0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51758
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-16796 obdclass: Change struct jobid_pid_map to use refcount_t 47/51747/3
Arshad Hussain [Sun, 23 Jul 2023 05:40:04 +0000 (11:10 +0530)]
LU-16796 obdclass: Change struct jobid_pid_map to use refcount_t

This patch changes struct jobid_pid_map to use
refcount_t (kref) instead of atomic_t.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ia3458d5605a8cff2bb65476495c321fa98cf01dc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51747
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16969 build: check that pkg-config is installed 16/51716/3
Timothy Day [Wed, 19 Jul 2023 17:23:24 +0000 (17:23 +0000)]
LU-16969 build: check that pkg-config is installed

The PKG_CHECK_MODULES macro fails in ways that are very annoying to debug.
Often, this will fail with:

 syntax error near unexpected token `LIBNL3,'
 ` PKG_CHECK_MODULES(LIBNL3, libnl-genl-3.0 >= 3.1)'

and provide no indication that the real error is that
pkg-config is not installed. Adding an explicit check
for pkg-config will make the error more self-evident.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: Ic2ee8e4c3ec3fa2e03c5ece03e6a9ce335133578
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51716
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-16958 llite: migrate vs regular ops deadlock 41/51641/2
Bobi Jam [Wed, 12 Jul 2023 15:05:27 +0000 (23:05 +0800)]
LU-16958 llite: migrate vs regular ops deadlock

When lov_conf_set() needs to lock the inode, it may already hold the
inode's lli_layout_mutex. We need to unlock the layout mutex before
taking the inode lock to keep the lock order.

Fixes: 51d62f2122f ("LU-16637 llite: call truncate_inode_pages() in inode lock")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7ee58039a6d31daefc625ac571a52baf112f8151
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16878 tests: use RUNAS_UID / RUNAS_GID for NRS TBF 38/51238/5
James Simmons [Thu, 20 Jul 2023 19:57:36 +0000 (15:57 -0400)]
LU-16878 tests: use RUNAS_UID / RUNAS_GID for NRS TBF

Some of the sanityn NRS TBF tests hardcode the use of uid 500 and gid
500. These are not guaranteed to exist, so use RUNAS_UID and RUNAS_GID
instead.

Test-Parameters: trivial testlist=sanityn env=ONLY=77
Change-Id: Ie987c70e94918c5cddadb632a4a3a3caac12c96f
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51238
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16796 libcfs: Remove reference to LASSERT_ATOMIC_ZERO 04/51004/6
Arshad Hussain [Tue, 16 May 2023 03:00:49 +0000 (08:30 +0530)]
LU-16796 libcfs: Remove reference to LASSERT_ATOMIC_ZERO

This patch removes all references to the LASSERT_ATOMIC_ZERO macro.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I73259599d1dee6277fadf66181699f1282274a80
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51004
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16430 ptlrpc: racy rq_obsolete bit modification 05/49505/6
Andriy Skulysh [Thu, 24 Nov 2022 13:18:04 +0000 (15:18 +0200)]
LU-16430 ptlrpc: racy rq_obsolete bit modification

Racy bit modification causes assertion failure in
ptlrpc_at_remove_timed():
ASSERTION( !list_empty(&req->rq_srv.sr_timed_list) )

rq_obsolete is a bit field, so its modification
isn't atomic and it should be modified under rq_lock.
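
To make the reasoning concrete: setting one bit in a packed word is a
read-modify-write of the whole word, so two writers touching
neighbouring bits can lose each other's update unless they serialize
on a lock. A minimal Python model follows; the `Request` class and the
`TIMED_OUT` flag are hypothetical illustrations, and only the names
rq_obsolete and rq_lock come from the commit message:

```python
import threading


class Request:
    """Several one-bit flags packed into one word, like a C bit field."""

    OBSOLETE = 1 << 0
    TIMED_OUT = 1 << 1  # hypothetical neighbouring flag in the same word

    def __init__(self):
        self.flags = 0
        self.rq_lock = threading.Lock()

    def set_flag(self, bit):
        # The update is a read of the whole word followed by a write of the
        # whole word; the lock keeps another writer from slipping in between.
        with self.rq_lock:
            word = self.flags
            self.flags = word | bit


def set_concurrently(req, bits):
    """Set each bit from its own thread and return the final flag word."""
    threads = [threading.Thread(target=req.set_flag, args=(bit,)) for bit in bits]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return req.flags
```

Because every writer holds `rq_lock` across the read and the write,
both bits are always present in the final word.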

Change-Id: Ib1d3ad189a78b71ecf5b01585478922e984c9568
HPE-bug-id: LUS-11368
Fixes: 23773b32bf ("LU-11444 ptlrpc: resend may corrupt the data")
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49505
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16053 build: Update zfs configure checks 89/48089/10
Shaun Tancheff [Mon, 22 May 2023 12:36:21 +0000 (07:36 -0500)]
LU-16053 build: Update zfs configure checks

From Brian Behlendorf <behlendorf1@llnl.gov>:

update dmu_*_by_dnode checks
 Provided as a feature since ZFS 0.7.0, convert to a fatal configure
 error when unavailable.

update zap_*_by_dnode checks
 Provided as a feature since ZFS 0.7.0, convert to a fatal configure
 error when unavailable.

update multihost protection check
 Provided as a feature since ZFS 0.7.0, convert to a fatal configure
 error when unavailable.  Drop the compatibility code required to
 support OpenZFS releases older than 0.7.0.

update userobj accounting check
 Provided as a feature since ZFS 0.7.0, convert to a fatal configure
 error when unavailable.

update dmu_prefetch() check
 Provided since at least ZFS 0.7.0, convert to a fatal configure
 error when unavailable.

update dmu_object_alloc_dnsize() check
 Provided since at least ZFS 0.7.0, convert to a fatal configure
 error when unavailable.

update spa_maxblocksize() check
 Provided since at least ZFS 0.7.0, convert to a fatal configure
 error when unavailable.

update dsl_pool_config_enter/exit check
 Convert to a fatal configure error, these functions have
 been provided since at least ZFS 0.7.x.

replace sa_spill_block() check
 The sa_spill_block() function was removed after the ZFS 0.6.x
 release.  Replace the check with one for zio_buf_alloc/free,
 which have been available since 0.7.x.

 The dsl_sync_task_do_nowait() function has not been provided
 since the 0.6.x releases.  Furthermore, the results of this
 check are unused by Lustre so let's just remove it.

Test-Parameters: trivial
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3c1597e56100961178f9001e918ffb9aa3558706
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48089
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-6142 obd: remove OBP and MDP macros 39/51739/2
Timothy Day [Fri, 21 Jul 2023 20:16:03 +0000 (20:16 +0000)]
LU-6142 obd: remove OBP and MDP macros

These macros save very little space, make it harder
to understand the code (by adding one more thing to
remember) and make it impossible to grep for
o_* and m_* functions. Luckily, they are only used in
a few places. So, remove them and all references.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I4c23199ca53c906ca190a81ffdf916ff6cff9a0b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51739
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-6142 obd: fix white space, header 38/51738/2
Timothy Day [Fri, 21 Jul 2023 20:29:32 +0000 (20:29 +0000)]
LU-6142 obd: fix white space, header

Convert all of the remaining spaces to tabs. Also,
add SPDX text to file.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I2d4e71646f7aaa286f7500564c817c76a4b716ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51738
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11036 test: race in sanity-lfsck test_8 20/51720/3
Lai Siyao [Mon, 10 Jul 2023 04:30:28 +0000 (00:30 -0400)]
LU-11036 test: race in sanity-lfsck test_8

In sanity-lfsck test_8, "sleep 1" is run after START_NAMESPACE,
but there is still a chance that the LFSCK status is complete while
the LFSCK thread has not quit yet, so the following START_NAMESPACE
may fail with -EALREADY. Just check that the first LFSCK started scanning.

And similarly use wait_update to check flags for DELAY3.
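
The idea of replacing a fixed sleep with a polled condition can be
sketched generically. The test framework's wait_update is a shell
helper and is not shown on this page; the Python function below is a
hypothetical analogue with assumed parameter names, meant only to show
why polling a state is more robust than sleeping for a fixed time:

```python
import time


def wait_until(check, timeout=5.0, interval=0.01):
    """Poll check() until it returns True or the timeout expires.

    Returns True as soon as the condition holds and False on timeout, so
    the caller never depends on a fixed sleep being long enough.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass a predicate such as "the status reads scanning"
and proceed only once it returns True.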

Test-Parameters: trivial MDSCOUNT=2 MDTCOUNT=4 testlist=sanity-lfsck env=ONLY=8,ONLY_REPEAT=10
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie1f612bebb52c4755e5b4e13d58ab8bf2aeb2832
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51720
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9680 utils: add updating the key table for Netlink. 15/51715/3
James Simmons [Wed, 19 Jul 2023 16:07:26 +0000 (12:07 -0400)]
LU-9680 utils: add updating the key table for Netlink.

Currently the lnetconfig implementation only sends the key table once
to construct a YAML document. Add the ability to update the key
table at a later time. New keys will be used by the YAML
document.

Test-Parameters: trivial
Change-Id: Ie2201f91eb24d06c7e2a2d4abe3da3805f74e5a7
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51715
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9680 contrib: share libyaml C code generator 08/51508/5
James Simmons [Mon, 24 Jul 2023 16:10:17 +0000 (12:10 -0400)]
LU-9680 contrib: share libyaml C code generator

Writing proper libyaml C code is not easy. So I wrote an
application that generates the C code to help the developer not
struggle starting from scratch. It wouldn't be a one to one
copy and paste but it greatly helps. To build the application
just do gcc -lyaml yaml-event-dump.c

Test-Parameters: trivial
Change-Id: I1b570dbfc3ea2e6a7ec77b3743aa4cd80aba2acb
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51508
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16843 ldiskfs: merge extent blocks 96/51096/14
Alex Zhuravlev [Tue, 23 May 2023 13:30:58 +0000 (16:30 +0300)]
LU-16843 ldiskfs: merge extent blocks

There are cases (e.g. file written synchronously with discontiguous
blocks that are later filled in) when a lot of extents are created
initially, then the extents get merged over time, but there is no
way to merge the index blocks.  This can cause a very deep extent
index tree (above 5 levels) and cause problems like:

inode has invalid extent depth: 6

Merge leaf/index blocks (one at each level at most) to right/left
when extents are removed from the index.

submitted to ext4@ maillist:
https://lore.kernel.org/linux-ext4/7A2B8861-96AA-4815-BB58-180F63F62436@whamcloud.com/

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I746c0917e746eb442d3c69a23f591d9cdade76fa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51096
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16965 obd: remove unused obd_evict_inprogress 81/51681/2
Timothy Day [Fri, 14 Jul 2023 15:42:39 +0000 (15:42 +0000)]
LU-16965 obd: remove unused obd_evict_inprogress

Remove the atomic_t struct field obd_evict_inprogress
from 'struct obd_device'. This field was only ever
incremented in an unused function that was removed in
a previous patch. Hence, remove it altogether. This
patch also removes the associated wait queue.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Id151c1e6a0adde8c1aeb6dbc903b9d98d00fd21d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51681
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-15553 test: mkdir_on_mdt0 in recovery-small.sh 69/51669/3
Lai Siyao [Sat, 8 Jul 2023 20:35:43 +0000 (16:35 -0400)]
LU-15553 test: mkdir_on_mdt0 in recovery-small.sh

Many subtests in recovery-small.sh require the test dir to be created on
MDT0, so replace mkdir with mkdir_on_mdt0.

Fixes: b9c4dc3c33 ("LU-14792 llite: enable filesystem-wide default LMV")

Test-Parameters: trivial
Test-Parameters: testlist=recovery-small,recovery-small,recovery-small
Test-Parameters: MDSCOUNT=2 MDTCOUNT=4 testlist=recovery-small,recovery-small,recovery-small
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ibc37b2dd25bcd94794392f5ff8a79df2e7932dcc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51669
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16951 test: don't call echo in function call 15/51615/2
Hongchao Zhang [Thu, 6 Jul 2023 17:16:09 +0000 (01:16 +0800)]
LU-16951 test: don't call echo in function call

In sanity-quota, the call '$(get_slave_nr expr "foo")' will fail
if there is an "echo" call in "wait_update_facet/wait_update_cond".
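
The underlying shell behavior is that command substitution captures
everything a function writes to stdout, so a stray progress message
ends up mixed into the "return value". The sketch below demonstrates
this by running two tiny shell functions through `sh` from Python and
capturing their stdout, which is what `$(...)` does. The function
names `noisy` and `quiet` and the message text are hypothetical, and
the sketch assumes a POSIX `sh` is on the PATH:

```python
import subprocess

# Two hypothetical helpers: both "return" 3 on stdout, but one also
# prints a progress message to stdout while the other sends it to stderr.
SCRIPT = r'''
noisy() { echo "Waiting 1s for update"; echo 3; }
quiet() { echo "Waiting 1s for update" >&2; echo 3; }
"$1"
'''


def capture(func):
    """Return what $(func) would capture: the function's stdout, stripped."""
    result = subprocess.run(
        ["sh", "-c", SCRIPT, "sh", func],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Here `capture("quiet")` yields just the number, while
`capture("noisy")` yields the message followed by the number, which
is no longer usable as an integer.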

Test-Parameters: trivial
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ib35bf8ccd7eb121a0a2852ba7ed69ad9b01f271a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51615
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16947 tests: On error correctly kill multiop 89/51589/2
Arshad Hussain [Thu, 6 Jul 2023 10:25:04 +0000 (15:55 +0530)]
LU-16947 tests: On error correctly kill multiop

multiop_bg_pause in test-framework starts multiop in the
background, and multiop waits for a signal if the "_" option is
provided. In 'verbose' mode the PAUSING string is printed on the
console; multiop_bg_pause checks for it and reports an error
if it is not found.

On error, the running multiop binary must be killed;
otherwise the test eventually times out instead of exiting.

Currently, on error, multiop_bg_pause sends the signal
to the wrong PID. This patch fixes this issue.

Test-Parameters: trivial testlist=replay-single mdscount=2 mdtcount=4
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I3fb505302615512a891725e7339a6f0238c2cdab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51589
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16663 tests: use python to compare YAML files 07/51507/6
James Simmons [Tue, 11 Jul 2023 17:28:13 +0000 (13:28 -0400)]
LU-16663 tests: use python to compare YAML files

For the sanity-lnet test we often compare different YAML files.
This is done with diff which is the wrong tool since two YAML
files that are the same can have different indentations. The
libyaml maintainer states using python tools for this is the
proper supported way to do this.
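
The script added by this patch is not shown on this page. The sketch
below only illustrates the principle of comparing parsed documents
instead of text. To stay runnable with the standard library alone it
uses JSON as a stand-in for YAML (an assumption of this example); with
PyYAML installed, `yaml.safe_load` would take the place of
`json.loads`. The sample document is invented for illustration:

```python
import difflib
import json


def same_text(a, b):
    """What a line-based diff sees: any layout change counts as a difference."""
    delta = difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm="")
    return not list(delta)


def same_document(a, b):
    """Compare the parsed data, so indentation and line breaks do not matter."""
    return json.loads(a) == json.loads(b)


# The same invented data, once on a single line and once indented.
compact = '{"net": [{"net type": "tcp", "local NI(s)": [{"nid": "10.0.0.1@tcp"}]}]}'
indented = json.dumps(json.loads(compact), indent=4)
```

For these two strings `same_text` reports a difference while
`same_document` reports equality, which is the distinction the commit
message draws between diff and a parser-based comparison.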

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ie0ef623e8ec729aaad862fc3f33eb0a3b4172fad
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51507
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-16911 sec: quiet messages from identity upcall retry mech 55/51355/3
Sebastien Buisson [Mon, 19 Jun 2023 11:38:07 +0000 (13:38 +0200)]
LU-16911 sec: quiet messages from identity upcall retry mech

Do not use CERROR to print messages about failed identity acquire
upcalls. Also distinguish between the initial attempt before retry
and the final failure.

Fixes: 61c3b3a9bb ("LU-16165 sec: retry mechanism for identity cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I35e04ca31b623d6037bb49e4ded4ea96d653f074
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51355
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-16483 tests: replay-single test_200 fixes 91/50891/11
Chris Horn [Thu, 4 May 2023 22:48:28 +0000 (17:48 -0500)]
LU-16483 tests: replay-single test_200 fixes

Modify test to ensure idle disconnect is enabled for all targets
except OST0000. This prevents an issue where an idle ping is sent to
another target instead of OST0000.

Re-work test to check the debug log for all relevant messages.

rcli is not set correctly when RCLIENTS contains multiple hostnames.
Fix it by not surrounding RCLIENTS with double quotes.

Added a debug statement to ptl_send_rpc(), and moved an existing one,
to facilitate debugging any future test failures.

Test-Parameters: trivial clients=3 testlist=replay-single env=ONLY=200,ONLY_REPEAT=100
Fixes: eb1f4a5 ("LU-16483 ptlrpc: Track highest reply XID")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If0a214092dad1e40f1b9e785e179ef67f686b85a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50891
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15527 dne: refactor commit-on-sharing for DNE 41/46641/11
Lai Siyao [Sat, 13 May 2023 08:25:53 +0000 (04:25 -0400)]
LU-15527 dne: refactor commit-on-sharing for DNE

Commit-on-sharing for DNE is different from the original
commit-on-sharing:
* the original commit-on-sharing is to eliminate dependency between
  operations from different clients.
* while commit-on-sharing for DNE is to eliminate dependency between
  operations handled by different MDTs, so that upon multiple MDT
  failures, an operation replay won't fail because its dependent
  operation is not replayed by another MDT yet.

Current CoS for DNE implementation checks dependency in MDT layer, and
it decides by checking whether current operation is a distributed
transaction, if so, it will trigger CoS upon conflicting locks.
Actually this may miss some cases that should trigger CoS (even local
transaction should trigger CoS if it depends on a distributed
transaction), and on the other hand it may trigger extra CoS because
if two operations are handled by the same MDT, the dependency is
ensured because they will always be replayed by transaction number.
And to avoid mixing the code of two different CoS, the following
changes are made:
* add new ldlm lock mode LCK_TXN. On DNE system, downgrade PW/EX locks
  to this mode after transaction stop.
* add li_initiator_id in struct ldlm_inodebits, which is the index of
  MDT where the lock is enqueued, i.e. where operation is handled. If
  another operation handled by a different MDT requests a conflicting
  PW|EX mode lock against this TXN mode lock, it will trigger commit
  to ensure the dependent operation is committed to disk (NB, it
  doesn't trigger commit on all involved MDTs, but only the MDT where
  the conflict happens, which is enough to allow replay to succeed).
* remove LDLM_FL_COS_INCOMPAT and LDLM_FL_COS_ENABLED.
* MDT layer doesn't need to check such dependency any more, since lock
  itself knows.
* updated sanityn 33c, 33d and 33e since fewer CoS are triggered now.
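
The commit rule in the list above can be written as a small predicate.
This is a hedged Python model of the description only, not the ldlm
code: the function name and argument names are hypothetical, and
treating every request mode other than PW/EX as "no commit" is an
assumption of this sketch, since the message does not discuss other
modes.

```python
LCK_PW = "PW"
LCK_EX = "EX"
LCK_TXN = "TXN"


def triggers_commit(granted_mode, granted_initiator, req_mode, req_initiator):
    """Decide whether a conflicting lock request should force a commit.

    granted_initiator and req_initiator model li_initiator_id, the index
    of the MDT where each operation is handled.
    """
    return (
        granted_mode == LCK_TXN              # lock was downgraded after txn stop
        and req_mode in (LCK_PW, LCK_EX)     # conflicting PW|EX request
        and granted_initiator != req_initiator  # handled by a different MDT
    )
```

Two operations handled by the same MDT never trigger a commit in this
model, matching the observation that their replay order is already
ensured by transaction number.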

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib0149fcdc0178afd2c6894d211480f3c6c9284a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-13805 llite: Improve sync_io comments 67/50167/15
Patrick Farrell [Wed, 1 Mar 2023 15:43:49 +0000 (10:43 -0500)]
LU-13805 llite: Improve sync_io comments

Correct and improve comments on cl_sync_io_wait_recycle.

Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7784aa75df46831d1b501c823ec1ada48376b227
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50167
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
9 months ago LU-16405 osd: lookup cache 21/50521/36
Alex Zhuravlev [Mon, 3 Apr 2023 10:32:59 +0000 (13:32 +0300)]
LU-16405 osd: lookup cache

MDT may need to re-lookup just checked names (after locking).
introduce a trivial tiny per-thread cache in OSD in order to
make such a repeating lookup cheap.

the original issue is that ext4_add_entry() doesn't really
check for possible duplicate (that would be expensive as
a whole 4K block must be scanned).

important: the cache is reset upon request processing completion as
we don't update iversion on a disk (due to conflict with VBR).
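
A per-thread cache that is dropped when request processing completes
can be modelled as follows. This is an illustrative Python sketch, not
the OSD implementation: the class and method names are hypothetical,
and the unbounded dictionary is a simplification of the "tiny" cache
the message describes.

```python
import threading


class LookupCache:
    """Per-thread name lookup cache, reset at the end of each request."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup   # the expensive directory lookup
        self._local = threading.local()   # one private cache per thread

    def _entries(self):
        if not hasattr(self._local, "entries"):
            self._local.entries = {}
        return self._local.entries

    def lookup(self, parent, name):
        """Return the cached result for (parent, name), looking it up once."""
        key = (parent, name)
        entries = self._entries()
        if key not in entries:
            entries[key] = self._slow_lookup(parent, name)
        return entries[key]

    def request_done(self):
        """Drop everything: nothing is kept across requests."""
        self._entries().clear()
```

Resetting in `request_done()` stands in for the rule that the cache
must not outlive request processing, because there is no on-disk
version to validate older entries against.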

Fixes: 79acb9a9e7 ("LU-10235 mdt: mdt_create: check EEXIST without lock")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I40c3ee702f7895c3bda00b380f904cd587e0a1c4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50521
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months ago LU-15535 llite: deadlock on lli_lsm_sem 89/50489/11
Vitaly Fertman [Fri, 31 Mar 2023 18:04:44 +0000 (21:04 +0300)]
LU-15535 llite: deadlock on lli_lsm_sem

it may happen that one process is doing lookup, and after reply while
holding the LDLM lock is trying to update LSM/default LSM under the
write lli_lsm_sem for a dir.

another process has taken the read lli_lsm_sem (taken for all the MD
ops in ll_prep_md_op_data()) and is waiting for a conflicting PW LDLM
lock on server for its modification for this dir.

it may happen on restriping with LSM or on changing the default LSM, but
an even more frequent case is a racer run, even without striped dirs:
- racer does LFS mkdir -i $i <subdir> for each MDS, which creates a default
  LSM on these subdirs inherited endlessly - to keep the MDS index;
- racer also does mkdir -p <path>, in which case we do:
ll_new_node - create a parent dir, no RMF_DEFAULT_MDT_MD in reply
ll_lookup parent it=open - no RMF_DEFAULT_MDT_MD in reply
ll_new_node - create a child
the default LSM is inherited on the parent creation, however as those RPCs
do not have lookup LDLM lock and no data - the default layout is not set
for the parent in inode at the time of a child creation. thus a parallel
lookup which gets the LSM deadlocks with this ll_new_node().

at the same time, similar to CLIO, we do not need to hold a sem nor an
LDLM lock over the whole operation to avoid LSM modification on server,
we just need to take an uptodate LSM (this is a subject for LU-16320)
and to guarantee this op will be working on the client on this LSM for
the whole operation.

the solution is to let MD ops work on a copy of the LSM, thereby letting
others modify the LSM attached to the inode in parallel if needed.
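
The copy-based approach can be illustrated with a short model. This is
a sketch under stated assumptions rather than the llite code: `Inode`,
`snapshot_lsm`, `md_op` and the `stripe_count` key are hypothetical,
and a plain mutex stands in for the read/write semaphore lli_lsm_sem.

```python
import copy
import threading


class Inode:
    def __init__(self, lsm):
        self.lsm = lsm
        self.lsm_lock = threading.Lock()  # stands in for lli_lsm_sem

    def snapshot_lsm(self):
        """Hold the lock only long enough to take a private copy."""
        with self.lsm_lock:
            return copy.deepcopy(self.lsm)

    def update_lsm(self, new_lsm):
        with self.lsm_lock:
            self.lsm = new_lsm


def md_op(inode, during_op=None):
    """Run a metadata operation against a copy of the layout.

    during_op models work done by another process while this operation is
    in flight, for example a lookup that installs a new LSM on the inode.
    """
    lsm = inode.snapshot_lsm()
    if during_op is not None:
        during_op()  # does not block: the lock is no longer held here
    return lsm["stripe_count"]
```

Since the lock is released before the long-running part of the
operation, a concurrent update can proceed, and the operation still
sees a consistent layout for its whole duration.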

HPE-bug-id: LUS-10725
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I3137300b5bcce2e890994ce8751cdf7fce2f3f54
Reviewed-on: https://es-gerrit.hpc.amslabs.hpecorp.net/161525
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50489
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months ago LU-15535 revert: "LU-15284 llite: access lli_lsm_md with lock in all places" 88/50488/4
Vitaly Fertman [Fri, 31 Mar 2023 17:30:27 +0000 (20:30 +0300)]
LU-15535 revert: "LU-15284 llite: access lli_lsm_md with lock in all places"

This reverts commit 1dfae156d1dbc11cfb77b2d35cbffb2da7f28137
as a prerequisite of a larger fix in this ticket which covers
this problem as well.

Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: Ic1b0b6c963ea96e9f51324625deaa851245f8a7d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50488
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
10 months ago LU-16667 build: struct mnt_idmap, linux/filelock.h 20/50420/11
Shaun Tancheff [Tue, 11 Jul 2023 12:42:27 +0000 (19:42 +0700)]
LU-16667 build: struct mnt_idmap, linux/filelock.h

Linux commit v6.2-rc3-9-g5970e15dbcfe
  filelock: move file locking definitions to separate
            header file

Add configure test for linux/filelock.h and include it
where needed.

linux kernel v6.2-rc1-4-gb74d24f7a74f
  fs: port ->getattr() to pass mnt_idmap
linux kernel v6.2-rc1-3-gc1632a0f1120
  fs: port ->setattr() to pass mnt_idmap

Add a configure test for mnt_idmap and fallback to using
user_namespace for older kernels.

Test-Parameters: trivial
HPE-bug-id: LUS-11557
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ib8cbb3157fb11b4f1fc55f1626c2998cb202bd8c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50420
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15365 tests: conf-sanity/115 to cleanup properly 36/45836/5
Alex Zhuravlev [Mon, 13 Dec 2021 11:47:59 +0000 (14:47 +0300)]
LU-15365 tests: conf-sanity/115 to cleanup properly

When /tmp cannot fit a large MDT filesystem image, the image
should still be removed afterwards.

Test-Parameters: trivial testlist=conf-sanity env=ONLY=115
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifb0bd201156f4beb665f3c38aa02d44802b13bbf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP 67/51667/3
Jian Yu [Fri, 14 Jul 2023 03:31:21 +0000 (11:31 +0800)]
LU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP

After a RHEL 9.x or Ubuntu 22.04 client is evicted, "lfs df" returns
error code 95 (EOPNOTSUPP), which is ignored in check_lfs_df_ret_val()
and then causes client_evicted() to ignore that error.

This patch fixes client_evicted() to check the return value
from "lfs df" directly so as not to ignore EOPNOTSUPP.

Test-Parameters: trivial clientdistro=el9.2 testlist=replay-vbr
Test-Parameters: trivial clientdistro=el8.8 testlist=replay-vbr
Test-Parameters: trivial clientdistro=ubuntu2204 testlist=replay-vbr

Change-Id: I633ae8769fc563b8068f433e2afae29463ac5553
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51667
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15046 tests: skip sanity-flr/test_200c for old MDS 49/51649/2
Alex Deiter [Wed, 12 Jul 2023 20:58:02 +0000 (00:58 +0400)]
LU-15046 tests: skip sanity-flr/test_200c for old MDS

Skip sanity-flr test_200c for old MDS missing the fix
for synchronized replicas and its corresponding test.

Fixes: b7ec0d2390 ("LU-15046 osp: precreate thread vs connect race")
Test-Parameters: trivial testlist=sanity-flr env=ONLY=200c
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I87cd7d6b767086f993a27ce6905b05f87e325474
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16341 tests: skip sanity-quota/test_1b for old MDS 48/51648/2
Alex Deiter [Wed, 12 Jul 2023 20:36:37 +0000 (00:36 +0400)]
LU-16341 tests: skip sanity-quota/test_1b for old MDS

Skip sanity-quota test_1b for old MDS missing the fix
for LU-16341 kernel NULL in qmt_site_recalc_cb.

Fixes: d965d63415 ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Test-Parameters: trivial testlist=sanity-quota env=ONLY=1b
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I1b1bc3fdfa8f36b0c20a9a06721735c8e02c034c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51648
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16913 quota: fix ASSERTION(lqe->lqe_gl) 29/51629/2
Sergey Cheremencev [Tue, 11 Jul 2023 14:28:12 +0000 (18:28 +0400)]
LU-16913 quota: fix ASSERTION(lqe->lqe_gl)

It is possible to add an lqe to qmt_reba_list a second time while
handling of the first addition has not yet finished. There is a
small window in qmt_id_lock_glimpse when lqe_link is empty but
lqe_gl is not set.

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I1168903bff88df7e5106186b082e8065a6480367
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51629
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16953 tests: wait longer in replay-dual/test_31 21/51621/3
Lei Feng [Tue, 11 Jul 2023 00:13:35 +0000 (08:13 +0800)]
LU-16953 tests: wait longer in replay-dual/test_31

Wait until the file has been created in replay-dual/test_31.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=replay-dual env=ONLY=31,ONLY_REPEAT=100
Change-Id: I847beb51d53e667f1599c9693aa5eb099dcf9435
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51621
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16944 utils: lfs find: handle multiple paths correctly 78/51578/2
Thomas Bertschinger [Wed, 5 Jul 2023 14:19:22 +0000 (10:19 -0400)]
LU-16944 utils: lfs find: handle multiple paths correctly

When lfs find is used with multiple paths and the first non-path
option is a '!' or an option without an argument like '-print',
the code skipped the final path because it assumed that the first
non-path option would be an option with an argument.

This commit resolves the bug by remembering the last-processed argv
index so that the index of the final path argument is correct whether
the next option consumes 2 indexes or just 1.

Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: I03133a43641af7a53a20d947b8ef82529e453251
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16893 libcfs: Remove force_sig usage from lfsck 70/51470/4
Shaun Tancheff [Tue, 27 Jun 2023 08:10:19 +0000 (15:10 +0700)]
LU-16893 libcfs: Remove force_sig usage from lfsck

The lfsck pool of kernel threads uses force_sig() to signal
the worker threads to stop. A signal is used here as the
lfsck workers may be waiting in various, and possibly
nested, states.

As force_sig() has been removed let us simply enable SIGINT
to be passed to the worker threads using send_sig().

Test-parameters: testlist=sanity-lfsck,lfsck-performance
HPE-bug-id: LUS-11670
Fixes: db9f9543ec ("LU-12634 libcfs: force_sig() removed task parameter")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ibf6a67f43687960b3eff9cb9a7c7dc8b1be1da63
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51470
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
10 months agoLU-16796 libcfs: Remove reference to LASSERT_ATOMIC_GT_LT 57/51157/7
Arshad Hussain [Mon, 29 May 2023 07:31:26 +0000 (03:31 -0400)]
LU-16796 libcfs: Remove reference to LASSERT_ATOMIC_GT_LT

This patch removes all references to the LASSERT_ATOMIC_GT_LT macro.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9acf820b32855e54369c18470fb1b73d7f08c41a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51157
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16831 lfs: limit stripe count for component size 43/51143/7
Bobi Jam [Wed, 21 Jun 2023 06:33:21 +0000 (14:33 +0800)]
LU-16831 lfs: limit stripe count for component size

If stripe count is larger than component_size/stripe_size, some
allocated OST objects are created but inaccessible. This patch
reduces the number of stripes in that case to avoid this.
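
The arithmetic behind this limit can be sketched as follows. This is only
an illustration under stated assumptions, not the code from the patch: the
helper name limit_stripe_count() and its parameters are hypothetical, and
the floor of one stripe is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the actual lfs code): cap the stripe count at
 * component_size / stripe_size, so that no object is allocated for a
 * stripe that can never receive data within the component.
 */
static uint32_t limit_stripe_count(uint32_t stripe_count,
				   uint64_t component_size,
				   uint64_t stripe_size)
{
	uint64_t usable;

	assert(stripe_size > 0);

	usable = component_size / stripe_size;
	if (usable == 0)
		usable = 1;	/* assumption: always keep at least one stripe */

	return stripe_count > usable ? (uint32_t)usable : stripe_count;
}
```

For example, a 4 MiB component with a 1 MiB stripe size can use at most
four stripes, so a requested count of eight would be reduced to four,
while a requested count of two would be left unchanged.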

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I117ed8a7696c6c6adcdd0c2c6531a958cc53bd51
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51143
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15969 llite: add support for ->fileattr_get/set 07/51107/4
Mr NeilBrown [Wed, 7 Dec 2022 07:55:57 +0000 (18:55 +1100)]
LU-15969 llite: add support for ->fileattr_get/set

From Linux 5.13, FS_IOC_SETFLAGS and FS_IOC_GETFLAGS are no longer
passed down to the filesystem; we need the ->fileattr_get/set
inode_operations instead.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Ib3ffba3529ea32b702ad80abd4b9e4e3ad90b412
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51107
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16824 ldiskfs: add support for openEuler 22.03 LTS SP1 78/50978/3
Xinliang Liu [Wed, 10 May 2023 10:08:38 +0000 (10:08 +0000)]
LU-16824 ldiskfs: add support for openEuler 22.03 LTS SP1

Add the openEuler 22.03 LTS SP1 config target file.
Fix minor conflicts in ext4-delayed-iput.patch and
ext4-data-in-dirent.patch.
Add the missing patch ext4-encdata.patch.
Add the kernel-debugsource package, which the ldiskfs build requires.

Change-Id: I68314c9df17ce991a5e46f2ed4746ce1703b1587
Test-Parameters: trivial
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50978
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16793 build: Enable compile tests to require <module>.ko 49/50849/2
Shaun Tancheff [Tue, 2 May 2023 10:29:44 +0000 (05:29 -0500)]
LU-16793 build: Enable compile tests to require <module>.ko

Currently the build tests only require that a kernel API test
produces an object file (.o).

Cases that have a missing symbol export, directly or
indirectly, will generate an object file and fail to
generate a kernel module (.ko).

Enable tests to select the stricter criteria.

Test-Parameters: trivial
Fixes: cc5594df3e ("LU-16759 o2ib: MOFED 5.5+ ib_dma_virt_map_sg")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iae481f1287023ea6c2432d147c497fa0a55fd689
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50849
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9859 llite: simplify parsing in pcc_conds_parse() 39/50839/6
Mr NeilBrown [Tue, 24 Nov 2020 06:16:38 +0000 (17:16 +1100)]
LU-9859 llite: simplify parsing in pcc_conds_parse()

If we duplicate the string to be parsed we can use standard parsing
tools like strsep() and strcmp().

Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia0d0fb4fbc8d85f1e47e6085392e0f84b000b8a8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50839
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9859 ptlrpc: simplify nrs_tbf_jobid_list_parse() 38/50838/8
Mr NeilBrown [Tue, 24 Nov 2020 05:52:12 +0000 (16:52 +1100)]
LU-9859 ptlrpc: simplify nrs_tbf_jobid_list_parse()

If we duplicate the string passed to nrs_tbf_jobid_list_parse(), we
can parse using standard mutating tools.

Test-Parameters: trivial
Test-Parameters: testlist=sanityn env=ONLY=77
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id04390c2ed0a26a0311a8bbc784979eb18f4d19d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50838
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9859 ptlrpc: simplify nrs_tbf_check_id_value and callers 36/50836/4
Mr NeilBrown [Tue, 24 Nov 2020 05:17:33 +0000 (16:17 +1100)]
LU-9859 ptlrpc: simplify nrs_tbf_check_id_value and callers

The string passed down to nrs_tbf_check_id_value() is always "token"
in nrs_tbf_id_parse(), which is in 'buffer' in nrs_tbf_parse_cmd() and
is modified in-place there and in
ptlrpc_lprocfs_nrs_tbf_rule_seq_write().

So it must be safe to modify the string in nrs_tbf_check_id_value()
too.  So change that function to parse it using the primitives
commonly used in the kernel such as strsep and strim.

Change all callers to use "char *" rather than "struct cfs_lstr".

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie67dba002530e4bdfb3c3601a15dc49904f1adcf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9859 ptlrpc: simplifying expression parsing in nrs_tbf 35/50835/13
Mr NeilBrown [Tue, 24 Nov 2020 04:53:46 +0000 (15:53 +1100)]
LU-9859 ptlrpc: simplifying expression parsing in nrs_tbf

The standard approach to parsing in the kernel is to modify strings as
needed, such as to nul-terminate substrings.

Lustre tends to pass around lengths instead, which means that various
kernel functions such as kstrtoNN() or strsep() or even strcmp()
cannot be used.

We can simplify code in nrs_tbf if we kstrdup() strings before parsing
them, and then use standard functions.

cfs_gettok() strips spaces while finding the token.  With this patch,
stripping of spaces is left to the final stage where an expression
(a={b}) is being parsed.  It might arrive with arbitrary space such as
" a ={ b  }  ".

A test in sanityn has some spaces added in various places to ensure
that they are parsed correctly, as an earlier version of this patch
got some of that wrong.

The list parsed in nrs_tbf_id_list_parse() can have multiple separators
(spaces) between elements, which contrasts with expressions which only
have a single "=" or "&" etc.

So strsep() might return an empty token between two consecutive
spaces.  This is not necessarily an error - it is only an error if
*all* tokens are empty.  So we add a "list_empty()" test at the end.
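
The empty-token behaviour described above can be seen in a small
user-space sketch. It uses the libc strsep() and strdup() rather than the
kernel versions, and count_nonempty_tokens() is a hypothetical helper
written for this example, not a function from nrs_tbf.

```c
#define _DEFAULT_SOURCE	/* for strsep() and strdup() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Count the non-empty tokens in a space-separated list.
 * Returns -1 if the working copy cannot be allocated.
 */
static int count_nonempty_tokens(const char *list)
{
	char *copy, *cursor, *token;
	int count = 0;

	assert(list != NULL);

	/* strsep() writes NUL bytes into the string, so parse a copy. */
	copy = strdup(list);
	if (copy == NULL)
		return -1;

	cursor = copy;
	while ((token = strsep(&cursor, " ")) != NULL) {
		/* Consecutive separators yield "" tokens; skip them. */
		if (*token == '\0')
			continue;
		count++;
	}

	free(copy);
	return count;
}
```

A caller following the approach in this message would treat a result of
zero as the error case, since that means every token was empty.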

Test-Parameters: trivial
Test-Parameters: testlist=sanityn
Test-Parameters: testlist=sanity
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id4fb399773e49e4869ca5ebf93fe63c864d82287
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50835
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9859 lnet: move ioctl device to lnet 33/50833/10
Mr NeilBrown [Tue, 24 Nov 2020 00:42:44 +0000 (11:42 +1100)]
LU-9859 lnet: move ioctl device to lnet

The misc device "/dev/lnet" is currently managed in libcfs code,
despite the fact that it is named "lnet" and almost all ioctl
handlers are in lnet code.

So move the management of the device to lnet code, leaving just
the minimal amount in libcfs:
case IOC_LIBCFS_CLEAR_DEBUG:
case IOC_LIBCFS_MARK_DEBUG:

Also rename various parts of the interface from libcfs_ioctl* to
lnet_ioctl*.
ioctl names, data structures, and include files are left unchanged for
now.

Note that the return value from LNetCtl() was previously passed
through notifier_from_errno() and notifier_to_errno().  This had the
effect of turning any positive value to zero.  We need to preserve
that and not return positive results.  PING would return a positive
result.  lnd->lnd_ctl probably doesn't, but due to the difficulty of
auditing, it is safer to always force the result to non-positive.
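
The rule in the last paragraph amounts to a simple clamp. A minimal
sketch, assuming a hypothetical helper name force_nonpositive() that does
not appear in the patch:

```c
#include <assert.h>

/*
 * Hypothetical helper: keep negative errno values and zero unchanged,
 * and turn any positive result into zero, which is what the old
 * notifier_from_errno()/notifier_to_errno() round trip did in effect.
 */
static int force_nonpositive(int rc)
{
	return rc > 0 ? 0 : rc;
}

/* Small usage example. */
static void force_nonpositive_examples(void)
{
	assert(force_nonpositive(3) == 0);	/* positive result is hidden */
	assert(force_nonpositive(0) == 0);	/* success is unchanged */
	assert(force_nonpositive(-22) == -22);	/* errors pass through */
}
```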

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9cf158f1f9d8b03687d85ba40bd88f1f0ab8e4b8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50833
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16786 utils: Replace open call to WANT_FD 64/50764/11
Arshad Hussain [Wed, 26 Apr 2023 05:51:15 +0000 (11:21 +0530)]
LU-16786 utils: Replace open call to WANT_FD

Replace the open call for WANT_FD with the newly added API
llapi_root_path_open(), which was added under LU-16427.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I97d55321cf32e40eaf7d6284c47f313199a6c406
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50764
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
10 months agoLU-16275 tests: Modify killall of replay-dual 25 77/48977/5
Kevin Zhao [Fri, 28 Oct 2022 06:55:52 +0000 (14:55 +0800)]
LU-16275 tests: Modify killall of replay-dual 25

When run in-tree, multiop is renamed to lt-multiop, which is
not killed by a plain killall. So it is better to pass a regex
to killall in order to kill all processes named multiop or
lt-multiop.

Test-Parameters: trivial testlist=replay-dual env=ONLY=25
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Change-Id: Ic9b064a6bb0d944eedb5dc019ba5d4d05c98eeae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48977
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16298 ldiskfs: Periodically write ldiskfs superblock 40/51340/11
Vitaliy Kuznetsov [Tue, 11 Jul 2023 08:12:25 +0000 (14:12 +0600)]
LU-16298 ldiskfs: Periodically write ldiskfs superblock

This patch introduces a mechanism to periodically check and update
the superblock within the ext4 file system. The main purpose of this
patch is to keep the disk superblock up to date. The update will be
performed if more than one hour has passed since the last update,
and if more than 16MB of data have been written to disk.

This check and update is performed within the
ext4_journal_commit_callback function, ensuring that the superblock
is written while the disk is active, rather than based on a timer
that may trigger during disk idle periods.
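
The update condition described above can be sketched as a predicate. This
is an illustration only: sb_update_due(), the constant names, and the
reading of "16MB" as 16 MiB are assumptions made for the example, not
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SB_UPDATE_INTERVAL_SEC	3600ULL		/* one hour */
#define SB_UPDATE_MIN_BYTES	(16ULL << 20)	/* assumed: 16 MiB */

/*
 * Hypothetical predicate: the superblock is rewritten only when both
 * conditions hold, i.e. enough time has passed and enough data has been
 * written since the last update.
 */
static bool sb_update_due(uint64_t now_sec, uint64_t last_update_sec,
			  uint64_t bytes_written)
{
	assert(now_sec >= last_update_sec);

	return now_sec - last_update_sec > SB_UPDATE_INTERVAL_SEC &&
	       bytes_written > SB_UPDATE_MIN_BYTES;
}
```

Evaluating such a predicate from the journal commit callback means it is
only checked while the disk is already active.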

Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I06eb9624b663a6ca6b15c6af2373b82f1bb63de6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51340
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16847 ldiskfs: ->fiemap replaced with ldiskfs_map_blocks. 46/51146/8
Alexey Lyashkov [Fri, 26 May 2023 10:15:36 +0000 (13:15 +0300)]
LU-16847 ldiskfs: ->fiemap replaced with ldiskfs_map_blocks.

Let's avoid hacks that copy data from the fiemap buffer to kernel
space by using ldiskfs_map_blocks directly.
This should make the code much clearer.

Fixes: 5cd5a49c7213 ("LU-16321 osd: Allow fiemap on kernel buffers")
HPE-bug-id: LUS-11645
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I1c527ce653b8943801e5bf56fd172a5f05e22dfc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-6142 lustre: use list_first/last_entry() for list heads 28/50828/5
Mr NeilBrown [Wed, 6 Nov 2019 22:47:35 +0000 (09:47 +1100)]
LU-6142 lustre: use list_first/last_entry() for list heads

This patch changes
    list_entry(foo.next, ...)
to
    list_first_entry(&foo, ...)
and
    list_entry(foo.prev, ...)
to
    list_last_entry(&foo, ...)

in cases where 'foo' is a list head - not a list member.
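
The equivalence of the two spellings can be checked in user space with a
reduced copy of the list macros. The macros below are simplified stand-ins
for the kernel's <linux/list.h>, and struct item and first_last_ids() are
hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Simplified user-space versions of the kernel macros. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
	list_entry((head)->next, type, member)
#define list_last_entry(head, type, member) \
	list_entry((head)->prev, type, member)

struct item {
	int id;
	struct list_head link;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Build a three-element list and report the ids at both ends. */
static void first_last_ids(int *first_id, int *last_id)
{
	struct list_head queue;
	struct item a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

	list_init(&queue);
	list_add_tail(&a.link, &queue);
	list_add_tail(&b.link, &queue);
	list_add_tail(&c.link, &queue);

	/* Old and new spellings name the same element. */
	assert(list_entry(queue.next, struct item, link) ==
	       list_first_entry(&queue, struct item, link));
	assert(list_entry(queue.prev, struct item, link) ==
	       list_last_entry(&queue, struct item, link));

	*first_id = list_first_entry(&queue, struct item, link)->id;
	*last_id = list_last_entry(&queue, struct item, link)->id;
}
```

The newer form states the intent (first or last element of a list head)
directly instead of relying on the reader to interpret .next and .prev.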

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I22b1278f5b481ce3074db3e59d37d9148016aed5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9806 obdclass: wait for all exports to go 47/50147/11
Alex Zhuravlev [Mon, 27 Feb 2023 18:40:34 +0000 (21:40 +0300)]
LU-9806 obdclass: wait for all exports to go

obd_zombie_export_add() removes an export from the stale list
and then schedules a job to destroy that export. In this short
window ofd_fini()/mdt_fini() can find the obd_linked_exports list
empty and no work in the zombie work queue. The obd is then
removed, and a concurrent export destroy may find the obd in an
unexpected state:
LustreError: 11166:0:(tgt_lastrcvd.c:469:tgt_client_free())
ASSERTION( lut && lut->lut_client_bitmap ) failed

Use the obd_stale_export_num counter to block in obd_zombie_barrier.

Move atomic_inc() from class_unlink_export to obd_zombie_export_add()
as self-exports are not added to the stale list.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I62ed019f86becd3c66f5fcdf991f13cd47466e5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50147
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8853 nodemap: fix up lctl nodemap_info 68/23868/4
Kit Westneat [Fri, 18 Nov 2016 21:10:55 +0000 (16:10 -0500)]
LU-8853 nodemap: fix up lctl nodemap_info

When nodemap_info was moved over to using param_display, some of the
functionality got lost. This patch modifies nodemap_info to use
GET_PARAM instead of LIST_PARAM when a nodemap is specified.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I85df8e1cb43e002f4a112b9b671725862210dbec
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/23868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
10 months agoLU-8582 tests: skip sanity/905 for old OSTs 68/51568/3
Andreas Dilger [Wed, 5 Jul 2023 00:58:07 +0000 (18:58 -0600)]
LU-8582 tests: skip sanity/905 for old OSTs

The fail_loc used in sanity test_905 does not exist in older OSTs.
Skip this subtest for older OSTs.

Fixes: 566edb8c43 ("LU-8582 target: send error reply on wrong opcode")
Test-Parameters: trivial testlist=sanity serverversion=2.12.9 env=ONLY=905
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8fa50ec0f66afd9f24d562e0be57a416c04d8ba8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51568
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16101 tests: skip sanity/27J for more kernels 67/51567/4
Andreas Dilger [Tue, 4 Jul 2023 23:08:03 +0000 (17:08 -0600)]
LU-16101 tests: skip sanity/27J for more kernels

This is a bug in the kernel that is not present in older kernels
before commit v5.11-10234-gcbd59c48ae2b (5.12), and is fixed with
commit v6.2-rc4-61-g5956592ce337 (6.2).

Move this from ALWAYS_EXCEPT (bug that needs to be fixed) to skip
(test that is known to fail in some configs but has been fixed).

Fixes: af6f49698a18 ("LU-16101 tests: add sanity/27J to always_except")
Test-Parameters: trivial testlist=sanity clientdistro=el9.2 env=ONLY=27J
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8ec0a6d25a90e05672b039cd6c2b2fbf8a3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16937 utils: avoid lctl shmget() if not needed 26/51526/2
Andreas Dilger [Fri, 30 Jun 2023 19:41:23 +0000 (13:41 -0600)]
LU-16937 utils: avoid lctl shmget() if not needed

lctl is dynamically allocating an IPC shared memory segment
during every startup, even though it is only needed for a
small number of uncommon debug commands:

    shmget(IPC_PRIVATE, 65680, 0600)        = 196641
    shmat(196641, NULL, 0)                  = 0x7f752b1c5000
    shmctl(196641, IPC_RMID, NULL)          = 0

This setup can be moved to sub-commands that actually need it.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I41c790ce7cba2d9c48c1ec06eb23eb94aa548242
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51526
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16868 tests: add check for replace_nid 24/51524/5
Alex Deiter [Fri, 30 Jun 2023 18:02:49 +0000 (22:02 +0400)]
LU-16868 tests: add check for replace_nid

Add a check for replace_nid operations and return an
error to prevent module reload errors and timeouts
when unmounting targets.

Test-Parameters: trivial
Test-Parameters: testlist=conf-sanity env=ONLY=32a serverdistro=el7.9
Test-Parameters: testlist=conf-sanity env=ONLY=32a serverdistro=el8.7
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I29a5de826ac8f0040dd671e502d30bac4a082c43
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51524
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16934 kernel: update RHEL 8.8 [4.18.0-477.15.1.el8_8] 17/51517/2
Jian Yu [Fri, 30 Jun 2023 10:59:31 +0000 (18:59 +0800)]
LU-16934 kernel: update RHEL 8.8 [4.18.0-477.15.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.15.1.el8_8.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I66365dce63065a0a07958a182a3c705e9948d424
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51517
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14301 client: use EOPNOTSUPP instead of ENOTSUPP 11/51511/3
Andreas Dilger [Fri, 30 Jun 2023 00:26:44 +0000 (18:26 -0600)]
LU-14301 client: use EOPNOTSUPP instead of ENOTSUPP

Don't return NFS-specific error code ENOTSUPP back to userspace,
instead use EOPNOTSUPP.  ENOTSUPP does not print a useful error
message from strerror() if it is hit by an application.
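
The difference is visible from user space. In the sketch below,
KERNEL_ENOTSUPP is a local macro carrying the kernel-internal value 524,
since the userspace <errno.h> does not define ENOTSUPP; errno_message()
is a hypothetical wrapper used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Kernel-internal ENOTSUPP value; not defined by userspace <errno.h>. */
#define KERNEL_ENOTSUPP 524

/* Return the libc description for an errno value. */
static const char *errno_message(int err)
{
	return strerror(err);
}

/* Small usage example. */
static void errno_message_example(void)
{
	/* EOPNOTSUPP has a real description in libc. */
	assert(strstr(errno_message(EOPNOTSUPP), "supported") != NULL);

	/*
	 * For KERNEL_ENOTSUPP, libc has no entry; glibc typically returns
	 * a generic "Unknown error 524" style string instead.
	 */
	assert(errno_message(KERNEL_ENOTSUPP) != NULL);
}
```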

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iabd07b31069737e8ee7ca2382fd8cff6143ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51511
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-15626 tests: Remove variables not used 99/51499/6
Arshad Hussain [Wed, 28 Jun 2023 08:06:43 +0000 (13:36 +0530)]
LU-15626 tests: Remove variables not used

This patch removes variables which were
defined but not used. This was reported
by shellcheck.

This patch also replaces Unicode double
quotes wherever applicable.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I996532de119806e20552e9bbe54b15615ea2c2e0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51499
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16929 tests: Fix syntax error under ha.sh 82/51482/2
Arshad Hussain [Wed, 28 Jun 2023 09:41:03 +0000 (15:11 +0530)]
LU-16929 tests: Fix syntax error under ha.sh

Fix bash syntax error under ha.sh/ha_lfsck_repaired

Test-Parameters: trivial
Fixes: 1a7c352e9 ("LU-11504 tests: trigger lfsck after/during failover/failback")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic31099a9438e1174013843156147cbb3bd98366a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Elena <elena.gryaznova@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 ptlrpc: add missing headers 80/51480/2
Timothy Day [Fri, 23 Jun 2023 20:49:08 +0000 (20:49 +0000)]
LU-8191 ptlrpc: add missing headers

Missing headers for several c-files in ptlrpc
cause functions to be incorrectly marked as only
being used within their respective c-files. This
patch adds those missing headers. It also addresses
a couple minor style issues.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Idd8fa747a671079aba2b691ef23cc7564e5e2430
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51480
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 lustre: convert mdc,mdd,mdt,mgc functions to static 78/51478/2
Timothy Day [Fri, 23 Jun 2023 20:52:17 +0000 (20:52 +0000)]
LU-8191 lustre: convert mdc,mdd,mdt,mgc functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in mdc, mdd, mdt, and mgc static.

Also, remove mgs_client_add() since it was unused, and
move a declaration from a c-file to the proper header file.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia23f62465c27c83a9a0260bb45e8c8b710491558
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51478
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 lustre: convert osp,osd,osc,ofd functions to static 77/51477/2
Timothy Day [Fri, 23 Jun 2023 20:51:00 +0000 (20:51 +0000)]
LU-8191 lustre: convert osp,osd,osc,ofd functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in osp, osd, osc, and ofd static.

Also, fix a few minor style issues.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I3d7af7ec0fa2978bfdd0cb490f18f485a78f81f6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51477
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 lustre: convert ec,fid,ldlm,quota functions to static 76/51476/4
Timothy Day [Fri, 23 Jun 2023 20:47:26 +0000 (20:47 +0000)]
LU-8191 lustre: convert ec,fid,ldlm,quota functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in ec, fid, ldlm, and quota static.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic64bdf0d802fd4c963b7b7d3a654575ebde5c07d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51476
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 target: convert functions to static 75/51475/2
Timothy Day [Fri, 23 Jun 2023 20:41:36 +0000 (20:41 +0000)]
LU-8191 target: convert functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in target static.

Also, remove an unused function tgt_obd_log_cancel(),
and add some headers where they were missing.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I1823df3562cb181b275788560166c92b63483637
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51475
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 llite: convert functions to static 41/51441/2
Timothy Day [Fri, 23 Jun 2023 20:46:29 +0000 (20:46 +0000)]
LU-8191 llite: convert functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in llite static.

Also, conserve more * in comments.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iafa3bb84de158e31b27b7784243bc15e78187f10
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51441
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 obdclass: add static and remove functions 40/51440/3
Timothy Day [Fri, 23 Jun 2023 20:40:35 +0000 (20:40 +0000)]
LU-8191 obdclass: add static and remove functions

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in obdclass static.

There are a few functions which are never called
anywhere. These are removed. Additionally, there
is some debugging code (added 15 years ago) that
has also been removed.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5f1d438c9663e62789d26093ec9bdd5d76a3060a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 lnet: remove unused, fix non-static functions 36/51436/2
Timothy Day [Fri, 23 Jun 2023 20:45:23 +0000 (20:45 +0000)]
LU-8191 lnet: remove unused, fix non-static functions

lnet_selftest_structure_assertion() and
lnet_net_is_pref_rtr_locked() are never called.
This patch removes both functions.

Static analysis shows that a number of functions
could be made static. This patch also declares
several functions in lnet static.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ie1b49c5652553715cd9f96b56090d33a95e3b438
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51436
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16904 test: Add sanity-compr.sh to run sanity and sanityn with PFL layout 71/51371/6
Andreas Dilger [Tue, 20 Jun 2023 18:57:56 +0000 (12:57 -0600)]
LU-16904 test: Add sanity-compr.sh to run sanity and sanityn with PFL layout

Add sanity-compr.sh to run sanity and sanityn with a PFL layout.
Also fix problems in sanity subtests 56ba, 57b, 65e, 65g, 65n, and 204e.

Test-Parameters: trivial testlist=sanity-compr

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: Iefdc7757697629eb5c57d7694456249d62a2049e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51371
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16667 build: kernel_cap_t contains u64 21/50421/9
Shaun Tancheff [Sun, 7 May 2023 02:11:27 +0000 (21:11 -0500)]
LU-16667 build: kernel_cap_t contains u64

linux kernel v6.2-13111-gf122a08b197d
  capability: just use a 'u64' instead of a 'u32[2]' array

Add configure test for kernel_cap_t as u64 and provide
an accessor for the least significant 32 bits.

As of linux commit v3.6-10973-g607ca46e97a1 lustre implicitly
started to ignore some capabilities, see:
   include/uapi/linux/capability.h

The last capability flag was added by:
   linux commit v5.8-rc5-1-g124ea650d307

The capabilities that Lustre currently ignores are:
 - CAP_MAC_OVERRIDE
 - CAP_MAC_ADMIN
 - CAP_SYSLOG
 - CAP_WAKE_ALARM
 - CAP_BLOCK_SUSPEND
 - CAP_AUDIT_READ
 - CAP_PERFMON
 - CAP_BPF
 - CAP_CHECKPOINT_RESTORE

None of these appear to be important to Lustre operations,
and it should be fine to continue ignoring them.
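
The accessor idea can be sketched in userspace C roughly as follows.
This is only an illustration, not the code from this patch: the type
demo_cap_t and the helpers demo_cap_set() and demo_cap_low32() are
hypothetical names, and only the two bit numbers are taken from
include/uapi/linux/capability.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a capability set stored as one 64-bit value. */
typedef struct {
	uint64_t val;
} demo_cap_t;

/* Bit numbers of two capabilities from include/uapi/linux/capability.h. */
#define DEMO_CAP_CHOWN	0	/* CAP_CHOWN */
#define DEMO_CAP_BPF	39	/* CAP_BPF */

/* Return a copy of the set with one more capability bit raised. */
static inline demo_cap_t demo_cap_set(demo_cap_t cap, unsigned int bit)
{
	assert(bit < 64);
	cap.val |= (uint64_t)1 << bit;
	return cap;
}

/* Return only the least significant 32 bits; bits 32 and up are dropped. */
static inline uint32_t demo_cap_low32(demo_cap_t cap)
{
	return (uint32_t)(cap.val & 0xffffffffULL);
}
```

With such an accessor, a capability numbered 32 or higher (every entry
in the list above) simply does not appear in the 32-bit value.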

Test-Parameters: trivial
HPE-bug-id: LUS-11557
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I48ad7b1a34fff378c260dc73ea91b22aaa0d7469
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-13805 llite: Rename ldp_aio to sdio 70/50170/19
Patrick Farrell [Wed, 1 Mar 2023 17:54:11 +0000 (12:54 -0500)]
LU-13805 llite: Rename ldp_aio to sdio

ldp_aio is a weird name for a 'sub dio' struct - rename it
to sdio.

Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie6f75e420e4bf4069af36c802b51063f02981613
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50170
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
10 months agoLU-13805 llite: Convert allocate/get to use pvec 68/50168/21
Patrick Farrell [Wed, 1 Mar 2023 16:01:38 +0000 (11:01 -0500)]
LU-13805 llite: Convert allocate/get to use pvec

ll_allocate_dio_buffer and ll_get_user_pages both basically
work on the ll_dio_pages pvec, so this patch converts them
to do so explicitly - this makes it more obvious what
they're doing and their connection to the rest of the DIO
code.  This makes them less generic, but they have no other
users and seem unlikely to acquire any.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If643d7684d89e0e0c81ee9d13f0d94f84ed87a56
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50168
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
10 months agoLU-13805 tests: add debug to aiocp 89/49989/22
Patrick Farrell [Tue, 14 Feb 2023 18:26:23 +0000 (13:26 -0500)]
LU-13805 tests: add debug to aiocp

Improve debug in aiocp.c.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I154036992a61b64b1753b15e47e64c01b630a5cb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49989
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
10 months agoLU-16634 build: remove obsolete checkpatch strings 96/51596/2
Andreas Dilger [Thu, 6 Jul 2023 19:03:34 +0000 (13:03 -0600)]
LU-16634 build: remove obsolete checkpatch strings

Remove obsolete checkpatch.pl spelling.txt warnings.  These will
not normally be hit, but the mti_flags warning is occasionally a
false positive due to a duplicate struct name (which was originally
the reason for those fields to be renamed).

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4ee16eb62cf0ed944ac604d181ce58beead8501d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51596
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-15431 llite: skip fast reads if layout is invalid 82/46282/39
Alex Zhuravlev [Mon, 24 Jan 2022 13:10:03 +0000 (16:10 +0300)]
LU-15431 llite: skip fast reads if layout is invalid

Don't allow fast reads from the page cache if the layout
is not valid.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie4357a184faf9a5d0e33804270d3cb0cb7e67bb7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16637 llite: call truncate_inode_pages() in inode lock 57/50857/22
Bobi Jam [Thu, 4 May 2023 02:39:29 +0000 (10:39 +0800)]
LU-16637 llite: call truncate_inode_pages() in inode lock

In some cases vvp_prune()->truncate_inode_pages() is called
without an IO context, so we need to protect it with the inode lock
as well.

So we add ll_inode_info::lli_inode_lock_owner and set it according to
the VFS locking rules (Documentation/filesystems/Locking or
Documentation/filesystems/locking.rst), so that before calling
truncate_inode_pages() we lock the inode if it is not already locked
by the VFS.

Fixes: ef9be34478 ("LU-16637 llite: call truncate_inode_pages() under inode lock")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I84d7d999a49325810062a9a7337e184d35467820
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50857
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16927 tests: improve sanity-quota 69/51469/4
Arshad Hussain [Tue, 27 Jun 2023 07:03:12 +0000 (12:33 +0530)]
LU-16927 tests: improve sanity-quota

The DD variable already includes "if=/dev/zero"
and "bs=1M". This patch removes the duplicate
declaration of 'bs' and/or 'if' in calls that
use $DD.

This patch also changes direct 'dd' calls
to use $DD wherever applicable.

Also fix the error path variable name in
sanity-quota/52.

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9006c406f3b794ecdc37a451451538e5202a006a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51469
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 tests: convert functions to static 33/51433/3
Timothy Day [Fri, 23 Jun 2023 20:39:12 +0000 (20:39 +0000)]
LU-8191 tests: convert functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in various test helpers static.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I065fb4398ed1670ce6ad58cf946054f6bd1ec282
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16925 osd-ldiskfs: Remove unused bio_integrity_enabled 32/51432/2
Shaun Tancheff [Sat, 24 Jun 2023 05:08:48 +0000 (12:08 +0700)]
LU-16925 osd-ldiskfs: Remove unused bio_integrity_enabled

bio_integrity_enabled() is not used in lustre.

Remove the configure check and the associated code.

Test-Parameters: trivial
HPE-bug-id: LUS-10118
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I9d07333b91210a2f6545945cf48293179a71258e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51432
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16922 kernel: update RHEL 9.2 [5.14.0-284.18.1.el9_2] 10/51410/2
Jian Yu [Thu, 22 Jun 2023 06:56:51 +0000 (14:56 +0800)]
LU-16922 kernel: update RHEL 9.2 [5.14.0-284.18.1.el9_2]

Update RHEL 9.2 kernel to 5.14.0-284.18.1.el9_2.

Test-Parameters: trivial env=SANITY_EXCEPT=27J fstype=ldiskfs \
clientdistro=el9.2 serverdistro=el9.2 testlist=sanity

Test-Parameters: trivial env=SANITY_EXCEPT=27J fstype=zfs \
clientdistro=el9.2 serverdistro=el9.2 testlist=sanity

Change-Id: Ifa8f13200550e5f473b7d7d641155e349c453c03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-8191 mdt: convert functions to static 45/51345/2
Timothy Day [Sun, 18 Jun 2023 04:26:48 +0000 (04:26 +0000)]
LU-8191 mdt: convert functions to static

Static analysis shows that a number of functions
could be made static. This patch declares several
functions in mdt_coordinator.c static.

 mdt_coordinator.c:2145:9: warning: Should this function be static?
 ssize_t loop_period_show(struct kobject *kobj,
         ^

Further patches will follow to clean up the
remaining non-static functions in other subsystems.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I0350b0d5c88c0a8d1f1748d1d429cdf90afb96b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16548 lnet: Fixing missing gnilnd define CURRENT_LND_VERSION 42/51342/3
Frank Sehr [Fri, 16 Jun 2023 19:36:33 +0000 (12:36 -0700)]
LU-16548 lnet: Fixing missing gnilnd define CURRENT_LND_VERSION

Added missing define CURRENT_LND_VERSION for gni.
Declared kgnilnd_tunables_setup.

Test-Parameters: trivial
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: Ia327dcbdaa518a24a60e32b1dcb37c5b1d0dc78e
Fixes: 56097c4904 ("LU-16548 lnet: report actual timeout used by lnd")
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51342
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
10 months agoLU-16723 parser: fix help hanging 39/51339/2
Timothy Day [Fri, 16 Jun 2023 04:58:42 +0000 (04:58 +0000)]
LU-16723 parser: fix help hanging

Running a command such as 'lctl pcc help v' will
hang indefinitely. This is due to a bug in find_cmd,
which returns pointers from two different arrays.

This is fixed by setting top_level to be the
concatenation of the override_cmdlist and the
regular cmds.

Also, rather than recursing forever, give up after
reaching an arbitrary depth.
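
A depth-limited lookup of this kind could look roughly like the
userspace sketch below. It is not the parser code from this patch:
struct demo_cmd, demo_find() and the limit DEMO_MAX_DEPTH are
hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_DEPTH 8	/* arbitrary limit, chosen for this sketch */

/* Hypothetical command table entry; a table ends with a NULL name. */
struct demo_cmd {
	const char *name;
	const struct demo_cmd *sub;	/* nested table, or NULL */
};

/*
 * Walk argv through nested command tables.  Returns the deepest matching
 * entry, or NULL if nothing matches or the depth limit is reached.
 */
static const struct demo_cmd *demo_find(const struct demo_cmd *table,
					char **argv, int argc, int depth)
{
	const struct demo_cmd *cmd;

	assert(argc >= 0 && depth >= 0);
	if (table == NULL || argc == 0 || depth >= DEMO_MAX_DEPTH)
		return NULL;

	for (cmd = table; cmd->name != NULL; cmd++) {
		if (strcmp(cmd->name, argv[0]) != 0)
			continue;
		/* Descend only while there are more words to consume. */
		if (argc > 1 && cmd->sub != NULL)
			return demo_find(cmd->sub, argv + 1, argc - 1,
					 depth + 1);
		return cmd;
	}
	return NULL;
}
```

Even if a table accidentally refers back to itself, the depth check
makes the lookup return NULL instead of recursing without bound.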

Improve several help messages so that users will
have a better idea why their commands aren't
working.

Fixes: 21080400f9 ("LU-16723 libcfs: refactor parser to be simpler")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib22bc71e5952b1beb868bbe37bc8f6b08c94ff72
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16890 obd: OBD_FREE_PRE() to ignore NULL pointers 32/51332/7
Arshad Hussain [Thu, 15 Jun 2023 11:10:54 +0000 (16:40 +0530)]
LU-16890 obd: OBD_FREE_PRE() to ignore NULL pointers

This patch modifies OBD_FREE_PRE() so that it does not LASSERT
when a NULL pointer is passed, since the kfree() function
accepts NULL as a no-op.

This is the first set of patches, which modifies the definitions.
A subsequent set will modify the callers to take advantage of
this case.

As an example:

The caller will now be changed to:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
char *test;
OBD_ALLOC(test, 32);
...
OBD_FREE(test, 32);

Previously the caller used to be:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
char *test;
OBD_ALLOC(test, 32);
...
if (test)
OBD_FREE(test, 32);
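
The same idea in a userspace sketch, assuming a simple byte counter in
place of the real memory accounting. demo_alloc(), demo_free() and
demo_allocated are hypothetical names and do not reflect how the
OBD_ALLOC()/OBD_FREE() macros are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter of bytes currently allocated, for illustration. */
static size_t demo_allocated;

/* Allocate zeroed memory and account for it; returns NULL on failure. */
static void *demo_alloc(size_t size)
{
	void *ptr = calloc(1, size);

	if (ptr != NULL)
		demo_allocated += size;
	return ptr;
}

/*
 * Free memory and update the accounting.  A NULL pointer is treated as a
 * no-op instead of tripping an assertion, so callers need no NULL check.
 */
static void demo_free(void *ptr, size_t size)
{
	if (ptr == NULL)
		return;
	assert(demo_allocated >= size);
	demo_allocated -= size;
	free(ptr);
}
```

Because the NULL case returns before the accounting is touched, a
caller on an error path can call demo_free() unconditionally.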

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I735c8210e30f58da19ede4c87c07186108b35b99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51332
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16899 gnilnd: Use libcfs_nidstr and fix typo 30/51330/3
Shaun Tancheff [Sun, 18 Jun 2023 02:39:00 +0000 (09:39 +0700)]
LU-16899 gnilnd: Use libcfs_nidstr and fix typo

CDEBUG() in kgnilnd_peer_notify() should use libcfs_nidstr()

kgnilnd_finish_connect() has a typo in lnet_notify() where
peer_nid was intended.

Test-Parameters: trivial
Fixes: 4a88236f40 ("LU-10391 lnet: change lnet_notify() to take struct lnet_nid")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If45a28b654e27aa34b655fefaea142dc740fa46f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51330
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16898 osd-ldiskfs: do not return dr_error from past RPC 20/51320/7
Andrew Perepechko [Wed, 14 Jun 2023 16:33:43 +0000 (19:33 +0300)]
LU-16898 osd-ldiskfs: do not return dr_error from past RPC

dr_error was cleared in osd_init_iobuf() only before handling new
read/write RPCs, so a later non-read/write RPC handled by that thread
would return the stale dr_error value from the last read/write RPC.

Always clear dr_error in osd_trans_stop->osd_fini_iobuf() after it
is checked, so that it cannot affect later RPCs.
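
The clear-after-check pattern can be sketched in userspace C as
follows. Only the field name dr_error comes from the description
above; struct demo_iobuf, demo_io_fail() and demo_fini_iobuf() are
hypothetical and are not the osd-ldiskfs functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-thread I/O buffer that is reused across requests. */
struct demo_iobuf {
	int dr_error;	/* error recorded by the most recent I/O */
};

/* Record an I/O failure (a negative errno) while handling a request. */
static void demo_io_fail(struct demo_iobuf *iobuf, int err)
{
	assert(err < 0);
	iobuf->dr_error = err;
}

/*
 * Finish a request: hand the recorded error to the caller, then clear it
 * so that the next request handled with this buffer starts clean.
 */
static int demo_fini_iobuf(struct demo_iobuf *iobuf)
{
	int rc = iobuf->dr_error;

	iobuf->dr_error = 0;
	return rc;
}
```

A second call to demo_fini_iobuf() without an intervening failure
returns 0, which is the behavior a later non-read/write request needs.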

Change-Id: Idbeab67edc66b58e9869b67640693c7f1dd9d6f2
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-11682
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51320
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
10 months agoLU-16518 obd: fix style and clang error 11/51311/3
Timothy Day [Wed, 14 Jun 2023 02:07:52 +0000 (02:07 +0000)]
LU-16518 obd: fix style and clang error

Tabify the remaining code in this file and clean
up some of the comments. Add SPDX text.

Remove a function which is never used.

Fix the style of and inline cl_io_invariant.

Conserve a significant number of * so that they
can be repurposed in other comments.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6b1c5c700d7e6f13c8c57726db5da6595b9c060a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51311
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16796 libcfs: Remove reference to LASSERT_ATOMIC_GT 89/51189/4
Arshad Hussain [Thu, 1 Jun 2023 07:33:01 +0000 (13:03 +0530)]
LU-16796 libcfs: Remove reference to LASSERT_ATOMIC_GT

This patch removes all references to the LASSERT_ATOMIC_GT macro.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7978ceb495c3e03153843439109d48d47bba1e2a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51189
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
10 months agoLU-16846 nrs: Fix console messages 21/51121/6
Etienne AUJAMES [Wed, 24 May 2023 12:35:29 +0000 (14:35 +0200)]
LU-16846 nrs: Fix console messages

Fix format of console messages and missing end-of-line.

CERROR("%s.%d NRS: ....", service_name, cpt, ...);

Test-Parameters: trivial
Fixes: c098c09 ("LU-14976 nrs: change nrs policies at run time")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ib447673c69bcc853ebd1479463ca79bd5aa59964
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51121
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16842 fsx: tolerate delete last non-stale mirror error 90/51090/3
Bobi Jam [Tue, 23 May 2023 03:11:37 +0000 (11:11 +0800)]
LU-16842 fsx: tolerate delete last non-stale mirror error

The fsx mirror split test could try to delete the last non-stale mirror
of a file, and that is a tolerable error scenario. The fsx FLR test
randomly chooses a mirror operation, so this situation can happen.

Test-Parameters: trivial testlist=sanity-flr env=ONLY=70a
Fixes: 04ab0cc869c (LU-14156 utils: mirror split to check for last in-sync early)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I80c294da80740b21e00ae72a092fd8883ec7d60e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51090
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12019 build: Recognize Debian Kernel and set KMP dir 66/51066/6
Thomas Stibor [Tue, 13 Jun 2023 17:34:39 +0000 (13:34 -0400)]
LU-12019 build: Recognize Debian Kernel and set KMP dir

Recognize the Debian kernel and make sure the kernel module package
(KMP) directory matches the KMP_MODDIR of Ubuntu and the Debian
package build system.

Change-Id: Ia3570500ed538c5d3c7a002eafddfc715efbf580
Test-Parameters: trivial clientdistro=ubuntu2204
Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51066
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Thomas Stibor <thomas@stibor.net>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16805 llite: improve readpage debug 92/50892/2
Patrick Farrell [Mon, 8 May 2023 21:44:11 +0000 (17:44 -0400)]
LU-16805 llite: improve readpage debug

LU-16412 (which is a workaround for a kernel bug) added a
debug message in ll_readpage(), but this message is printed
every time rather than only when the kernel bug is hit.

Let's fix this.

Fixes: 209afbe28b "LU-16412 llite: check truncated page in ->readpage()"
Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ice02178eb9c07e03b58fb4e2d64ed3ea878cf137
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50892
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16697 llite: Set BDI_CAP_* flags for lustre 97/50497/8
Shaun Tancheff [Sat, 1 Apr 2023 08:41:16 +0000 (03:41 -0500)]
LU-16697 llite: Set BDI_CAP_* flags for lustre

Lustre should set the BDI_CAP_* flags and the s_iflags
to indicate support for writeback and cgroup writeback.

HPE-bug-id: LUS-11553
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I49ce07fce8a9d153b9a71d8a0ba28b799354fc7f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
10 months agoLU-16691 ldiskfs: limit length of per-inode prealloc list 81/50481/15
Alex Zhuravlev [Fri, 31 Mar 2023 05:41:07 +0000 (08:41 +0300)]
LU-16691 ldiskfs: limit length of per-inode prealloc list

In the scenario of writing sparse files, the per-inode prealloc list may
be very long, resulting in high overhead for ext4_mb_use_preallocated().
To circumvent this problem, we limit the maximum length of per-inode
prealloc list to 512 and allow users to modify it.
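
Capping a list in this way can be sketched in userspace C as below.
This is an illustration only: struct demo_pa, struct demo_pa_list and
the helpers are hypothetical, and dropping the oldest entries first is
an assumption of the sketch, not necessarily the policy the ldiskfs
patch applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical preallocation record; only the list linkage matters. */
struct demo_pa {
	struct demo_pa *next;
	unsigned long start;
};

/* Hypothetical per-inode list, newest entry first. */
struct demo_pa_list {
	struct demo_pa *head;
	size_t len;
	size_t max_len;		/* tunable cap, 512 in the text above */
};

/* Free every entry from pa to the end of the chain; returns the count. */
static size_t demo_pa_free_chain(struct demo_pa *pa)
{
	size_t freed = 0;

	while (pa != NULL) {
		struct demo_pa *next = pa->next;

		free(pa);
		pa = next;
		freed++;
	}
	return freed;
}

/* Drop the oldest entries so that at most max_len remain. */
static void demo_pa_trim(struct demo_pa_list *list)
{
	struct demo_pa *pa = list->head;
	size_t kept = 1;

	assert(list->max_len > 0);
	if (list->len <= list->max_len)
		return;

	/* Walk to the last entry that is kept. */
	while (kept < list->max_len) {
		pa = pa->next;
		kept++;
	}
	list->len -= demo_pa_free_chain(pa->next);
	pa->next = NULL;
}

/* Add a new entry at the head, then enforce the cap.  Returns 0 or -1. */
static int demo_pa_add(struct demo_pa_list *list, unsigned long start)
{
	struct demo_pa *pa = malloc(sizeof(*pa));

	if (pa == NULL)
		return -1;
	pa->start = start;
	pa->next = list->head;
	list->head = pa;
	list->len++;
	demo_pa_trim(list);
	return 0;
}
```

Since the list never grows past max_len, any scan over it does a
bounded amount of work no matter how sparse the file is.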

After patching, we observed that the sys ratio of cpu has dropped, and
the system throughput has increased significantly. We created a process
to write the sparse file, and the running time of the process on the
fixed kernel was significantly reduced, as follows:

Running time on unfixed kernel:
    # time taskset 0x01 ./sparse /data1/sparce.dat
    real    0m2.051s
    user    0m0.008s
    sys     0m2.026s

Running time on fixed kernel:
    # time taskset 0x01 ./sparse /data1/sparce.dat
    real    0m0.471s
    user    0m0.004s
    sys     0m0.395s

Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5e4ea3acfc07f6e69890690211bf6a34c1230979
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50481
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
10 months agoLU-16651 llite: hold invalidate_lock when invalidate cache pages 71/50371/4
Qian Yingjin [Tue, 21 Mar 2023 08:53:00 +0000 (04:53 -0400)]
LU-16651 llite: hold invalidate_lock when invalidate cache pages

Newer kernels (such as the one in Ubuntu 22.04) introduce a new member,
invalidate_lock, in struct address_space.
The filesystem must acquire invalidate_lock exclusively before
invalidating page cache in truncate / hole punch (and thus calling
into ->invalidatepage) to block races between page cache
invalidation and page cache filling functions (fault, read, ...).

However, the current Lustre client does not hold this lock when
removing pages from the page cache because of the revocation of the
extent DLM lock protecting them.
Suppose a client has two overlapping PR DLM extent locks:
- L1 = <PR, [1M, 4M - 1]>
- L2 = <PR, [3M, 5M - 1]>
A reader process holds L1 and reads data in the range [3M, 4M - 1].
L2 is being revoked due to a conflicting access.
Then the pages read in by the reader may be invalidated and deleted
from the page cache by the revocation of L2 (in the lock blocking AST).

Older kernels check each page after a read to see whether it was
invalidated and deleted from the page cache. If so, they retry the
page read.

Newer kernels remove this check and retry.
Instead, they introduce a new rw_semaphore in the address_space,
invalidate_lock. It is held shared to protect adding pages to the
page cache for page faults / reads / readahead, and held exclusive
to protect invalidating pages and removing them from the page cache
for truncate / hole punch.

Thus, this patch holds invalidate_lock exclusively on newer
kernels when removing pages from the page cache because of the
revocation of an extent DLM lock protecting them. Otherwise, the
result is an -EIO error or partial reads in the newly added test
case sanity/833.

Test-parameters: clientdistro=ubuntu2204 testlist=sanity env=ONLY=833,ONLY_REPEAT=10
Change-Id: If3a27002b89636b9fd4d7b5ea50afa9aeac5d121
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50371
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>