Whamcloud - gitweb
fs/lustre-release.git
21 months ago LU-15288 lnet: increase transaction timeout 80/45780/3
Cyril Bordage [Tue, 7 Dec 2021 22:14:43 +0000 (23:14 +0100)]
LU-15288 lnet: increase transaction timeout

In LU-13145, it was decided to increase the default transaction timeout
(LNET_TRANSACTION_TIMEOUT_DEFAULT) to 150s, but the associated patch
set it to 50s. Set it to 150s as intended. This change also modifies
lnd_timeout (from 16 to 49).
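
The change in lnd_timeout follows from the transaction timeout. A minimal
sketch of that arithmetic, assuming the relation
lnd_timeout = (transaction_timeout - 1) / (retry_count + 1) and a retry
count of 2. The function name, the relation, and the retry count are
assumptions for illustration and are not stated in this commit message:

```c
/* Sketch only: derive an LND timeout from a transaction timeout.
 * Assumed relation (not taken from the commit message):
 *   lnd_timeout = (transaction_timeout - 1) / (retry_count + 1)
 * evaluated with integer division. */
unsigned int sketch_lnd_timeout(unsigned int transaction_timeout,
				unsigned int retry_count)
{
	return (transaction_timeout - 1) / (retry_count + 1);
}
```

Under these assumptions, sketch_lnd_timeout(50, 2) gives 16 and
sketch_lnd_timeout(150, 2) gives 49, matching the values quoted above.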

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I13a8b5d14230bb6e8936cb3e18540f19dbc62985
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45780
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months ago LU-16321 osd: Allow fiemap on kernel buffers 90/49190/7
Shaun Tancheff [Fri, 2 Dec 2022 10:19:59 +0000 (04:19 -0600)]
LU-16321 osd: Allow fiemap on kernel buffers

Linux commit v5.17-rc3-19-g967747bbc084
  uaccess: remove CONFIG_SET_FS

With KERNEL_DS gone, Lustre needs an alternative way for fiemap to
copy extents to kernel-space memory.

Direct in-kernel calls to inode->i_op->fiemap() can utilize
an otherwise unused flag in fiemap_extent_info fi_flags
to indicate that the fiemap extent buffer is allocated in kernel space.

Include ldiskfs patches for ldiskfs_fiemap() to
define EXT4_FIEMAP_FLAG_MEMCPY and utilize it.

HPE-bug-id: LUS-11337
Fixes: d0337cab8e ("LU-14195 osd: don't use set_fs() for ->fiemap() calls.")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I7a8edb481833fd1bdcf7b6cd6e08397c1754baee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49190
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-14645 tests: test lfs setdirstripe with '/$' 63/49463/2
Jian Yu [Tue, 20 Dec 2022 20:24:25 +0000 (12:24 -0800)]
LU-14645 tests: test lfs setdirstripe with '/$'

This patch improves one of the lfs setdirstripe tests to
verify that a directory name ending with '/' also works.

Test-Parameters: trivial mdscount=2 mdtcount=4 \
env=ONLY=24B testlist=sanity

Change-Id: I237d5a9ebad42cc0569aa1db487d0df147372316
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49463
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-16373 tests: failover mds1 back to the primary server 45/49345/2
Jian Yu [Thu, 8 Dec 2022 07:56:36 +0000 (23:56 -0800)]
LU-16373 tests: failover mds1 back to the primary server

This patch fixes recovery-small test 144a to fail mds1 back over to
the primary server so that stack_trap can set the timeout parameter
on the correct MDS node.

Test-Parameters: trivial \
env=SLOW=yes,FAILURE_MODE=HARD,ONLY=144a \
clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
austeroptions=-R failover=true iscsi=1 \
testlist=recovery-small

Change-Id: Idbfdb7b084c7edac8784008e0455f76632aa685b
Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-16433 llite: check vvp_account_page_dirtied 12/49512/6
Jian Yu [Thu, 29 Dec 2022 08:21:32 +0000 (00:21 -0800)]
LU-16433 llite: check vvp_account_page_dirtied

This patch removes duplicated code from vvp_set_pagevec_dirty()
and checks vvp_account_page_dirtied to decide whether to fall back
to calling __set_page_dirty_nobuffers().

HAVE_ACCOUNT_PAGE_DIRTIED_EXPORT also needs to be checked because
vvp_account_page_dirtied is not defined if account_page_dirtied
is exported.

Test-Parameters: trivial clientdistro=el8.6 testlist=sanity

Test-Parameters: trivial clientdistro=el8.7 testlist=sanity

Test-Parameters: trivial clientdistro=el9.0 \
env=SANITY_EXCEPT="130 244a" testlist=sanity

Test-Parameters: trivial clientdistro=sles15sp4 \
env=SANITY_EXCEPT="27J 101j 244a" testlist=sanity

Change-Id: I272033d7494a157145224b1b8ce999a80958aa6c
Fixes: 4bf090b811 ("LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49512
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
21 months ago LU-16120 build: Add support for kobj_type default_groups 65/48365/12
Shaun Tancheff [Fri, 16 Dec 2022 09:41:54 +0000 (03:41 -0600)]
LU-16120 build: Add support for kobj_type default_groups

Linux commit v5.1-rc3-29-gaa30f47cf666
  kobject: Add support for default attribute groups to kobj_type

Linux commit v5.18-rc1-2-gcdb4f26a63c3
  kobject: kobj_type: remove default_attrs

Switch to using kobj_type default_groups when it is available.
Provide support for default_attrs for older kernels.

HPE-bug-id: LUS-11196
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I43b03c67c22307293a2abc444aa1a73889ca09ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48365
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-16297 ptlrpc: don't panic during reconnection 29/49029/9
Alexander Boyko [Thu, 3 Nov 2022 11:23:20 +0000 (07:23 -0400)]
LU-16297 ptlrpc: don't panic during reconnection

ptlrpc_send_rpc() could race with ptlrpc_connect_import_locked()
in the middle of an assertion check, which leads to a spurious panic.
The assertion first checks

(AT_OFF || imp->imp_state != LUSTRE_IMP_FULL ||

then a reconnect changes the import state and flags,
and the second part is evaluated

(imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) ||
!(imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_AT)))

MSGHDR_AT_SUPPORT is disabled during client reconnection.
Taking a lock in this hot path is undesirable, so the fix changes
the assertion to a report.

HPE-bug-id: LUS-10985
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ifc9e413c679c3e8a4c8f4f541251bebabae41c82
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-15935 tests: add version check to replay-dual test_33 98/49398/3
Jian Yu [Wed, 14 Dec 2022 02:31:05 +0000 (18:31 -0800)]
LU-15935 tests: add version check to replay-dual test_33

This patch adds MDS version check to replay-dual test_33
to avoid interop test failure.

Test-Parameters: trivial \
serverjob=lustre-b2_15 serverbuildno=28 \
env=ONLY=33 testlist=replay-dual

Test-Parameters: trivial env=ONLY=33 testlist=replay-dual

Change-Id: I3ec665302a431d3c0f07bc819a08237dbc5b4309
Fixes: 1a79d395dd ("LU-15935 target: keep track of multirpc slots in last_rcvd")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49398
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months ago LU-8367 osp: wait for precreate on reformatted OST 51/49151/6
Li Dongyang [Mon, 14 Nov 2022 13:28:37 +0000 (00:28 +1100)]
LU-8367 osp: wait for precreate on reformatted OST

We should wait for the precreate RPC to finish when we see a newly
reformatted/replaced OST; otherwise the client could try
to access an object on the OST before it is created.

Do not use sync_trans when recreating the objects on the
reformatted/replaced OST.

Fix detection of a reformatted OST for FID_SEQ_NORMAL: for such
sequences the OID is initialized to LUSTRE_FID_INIT_OID,
which is 1.

Change-Id: I4aebb9d573aa352dd7897e5f1129dc2117a084bb
Fixes: 63e17799a3 ("LU-8367 osp: enable replay for precreation request")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49151
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-15921 tests: fix sanity-hsm 24c 64/47564/2
Aurelien Degremont [Wed, 8 Jun 2022 07:49:32 +0000 (07:49 +0000)]
LU-15921 tests: fix sanity-hsm 24c

Fix bad copy-paste in test sanity-hsm 24c causing
the test to save 3 different tunables, but actually
restoring the same one three times.

Also improve the code to support values including spaces.

Test-Parameters: trivial testlist=sanity-hsm,sanity-pcc
Fixes: 2042bce ("LU-9474 tests: rewrite copytool_setup to use stack_trap")
Fixes: f172b11 ("LU-10092 llite: Add persistent cache on client")
Change-Id: I34cc61515ebb862d5996f41cdb2055ac53ccac65
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47564
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months ago LU-16277 lnet: fix bad parameter in LUTF 85/48985/2
Cyril Bordage [Mon, 31 Oct 2022 08:57:10 +0000 (09:57 +0100)]
LU-16277 lnet: fix bad parameter in LUTF

In SimpleLustreNode, the exception parameter is not passed to BaseTest,
so the parameter is ignored when a remote agent is used.

Test-Parameters: @lnet
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: Ie458ef4a41dc059da8f069d8d62d365c21c9f25d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-16384 tests: dump lustre log if DEBUG_RMMOD set 74/49374/5
Alex Zhuravlev [Mon, 12 Dec 2022 09:52:16 +0000 (12:52 +0300)]
LU-16384 tests: dump lustre log if DEBUG_RMMOD set

To simplify local development, reuse the existing code in the
lustre_rmmod script:
DEBUG_RMMOD=<logfile> sh sanity.sh will dump a text Lustre log to <logfile>.
Use DEBUG_RMMOD=- to direct the Lustre log to standard output.

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8d72e1e9cecb354bcc5d41ab3cca5767a298c668
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49374
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-15626 tests: Fix "error" reported by shellcheck (3/5) 37/49437/2
Arshad Hussain [Wed, 22 Jun 2022 12:25:56 +0000 (17:55 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck (3/5)

This patch fixes "error" issues reported by shellcheck
for file lustre/tests/test-framework.sh. This patch also
converts spaces to tabs.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I5c802e268e68edc118d89d86063a23bedf972013
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-16386 utils: Improve mkfs.lustre.8 man page 84/49384/2
Arshad Hussain [Tue, 13 Dec 2022 05:23:26 +0000 (10:53 +0530)]
LU-16386 utils: Improve mkfs.lustre.8 man page

This patch:
- improves the description of the "--version" argument in the Options section
- adds the "--version" option to the Examples section

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7fd3e7f1ea9a313a33db5620a92a595f2c4bd36f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49384
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months ago LU-16348 tests: export TESTLOG_PREFIX and TESTNAME to rpc.sh 60/49260/4
Jian Yu [Tue, 20 Dec 2022 04:50:59 +0000 (20:50 -0800)]
LU-16348 tests: export TESTLOG_PREFIX and TESTNAME to rpc.sh

In the Lustre test suites, if a remote function run by
do_rpc_nodes fails and error() is called,
gather_logs() cannot gather logs with the correct
prefix name because the TESTLOG_PREFIX and TESTNAME variables
were not exported to rpc.sh.

Test-Parameters: trivial testlist=sanity,conf-sanity
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I2bdbca7f1886f376160a87293ef367f3a4a59f86
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49260
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-15626 tests: Fix "error" reported by shellcheck (4/5) 38/49438/2
Arshad Hussain [Wed, 22 Jun 2022 12:47:30 +0000 (18:17 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck (4/5)

This patch fixes "error" issues reported by shellcheck
for file lustre/tests/test-framework.sh. This patch also
converts spaces to tabs.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I84b43cba5b50d6618bee756d2f3c7f59ab0d74da
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49438
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-16271 ptlrpc: fix eviction right after recovery 57/49257/4
Alexander Boyko [Mon, 28 Nov 2022 14:20:05 +0000 (09:20 -0500)]
LU-16271 ptlrpc: fix eviction right after recovery

When recovery finishes, exports could already be timed out, since the
recovery thread waits for stale clients and no more requests
arrive after the final ping. This was handled by updating the export
timers after final ping processing. LU-16002 introduced fast evictions,
which brought an error: eviction right after recovery.
Process the export timer updates before obd_recovering is cleared.

Fixes: 6bdeda7afe ("LU-16002 ptlrpc: reduce pinger eviction time")
Test-Parameters: testlist=replay-single env=ONLY=89,ONLY_REPEAT=20
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ibf3b2f632d6d3aa1de57038fdecbec38cf9a97cf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49257
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-16434 tests: replace '-m' with '-i' in sanity/230j 23/49523/3
Jian Yu [Wed, 28 Dec 2022 18:08:20 +0000 (10:08 -0800)]
LU-16434 tests: replace '-m' with '-i' in sanity/230j

In lfs_setdirstripe(), '-m' was originally used for '--mode'.
Fix sanity test_230j to replace '-m 0' with '-i 0' to force
directory creation on MDT0000 as the test expected.

Test-Parameters: trivial mdscount=2 mdtcount=4 \
env=ONLY=230j testlist=sanity

Change-Id: I10d435719f4b29ec47fa06c478caee9fcc8134a5
Fixes: 8deea7888c ("LU-11508 mdt: reject DoM file migration")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49523
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
22 months ago New tag 2.15.53 2.15.53 v2_15_53
Oleg Drokin [Sat, 24 Dec 2022 03:47:34 +0000 (22:47 -0500)]
New tag 2.15.53

Change-Id: I93c2e581fd13b3d233030ce3b178c23059276b01
Signed-off-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16378 lnet: handles unregister/register events 75/49375/4
Cyril Bordage [Sat, 10 Dec 2022 00:51:16 +0000 (01:51 +0100)]
LU-16378 lnet: handles unregister/register events

When the network is restarted, devices are unregistered and then
registered again. When a device registers using an index that is
different from the previous one (before the network was restarted), LNet
ignores it. Consequently, this device remains with its link in the fatal state.

To fix that, we catch unregistering events to clear the saved index
value, and when a registering event comes, we save the new value.
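
The index handling described above can be sketched roughly as follows.
This is an illustration only: the type names, the SKETCH_NO_INDEX
sentinel, and the helper functions are hypothetical and do not come from
the LNet code.

```c
#include <stdbool.h>

/* Hypothetical event types; the names are illustrative only. */
enum sketch_dev_event {
	SKETCH_DEV_REGISTER,
	SKETCH_DEV_UNREGISTER,
};

#define SKETCH_NO_INDEX (-1)

struct sketch_tracked_dev {
	int saved_index;	/* index last seen, or SKETCH_NO_INDEX */
};

/* Clear the saved index on unregister, save the new one on register. */
void sketch_handle_event(struct sketch_tracked_dev *dev,
			 enum sketch_dev_event event, int index)
{
	switch (event) {
	case SKETCH_DEV_UNREGISTER:
		dev->saved_index = SKETCH_NO_INDEX;
		break;
	case SKETCH_DEV_REGISTER:
		dev->saved_index = index;
		break;
	}
}

/* Later events are accepted only if they carry the saved index. */
bool sketch_index_matches(const struct sketch_tracked_dev *dev, int index)
{
	return dev->saved_index != SKETCH_NO_INDEX &&
	       dev->saved_index == index;
}
```

In this sketch, a device that re-registers with a different index is
tracked under the new index instead of being ignored.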

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I17e93a1103d588f3e630a9c7446b345f4d472b97
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49375
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h 28/49328/8
Lai Siyao [Thu, 1 Dec 2022 08:17:00 +0000 (03:17 -0500)]
LU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h

The lustre_user.h header uses the non-standard strchrnul() function
in userspace.  This always causes the LC_IOC_REMOVE_ENTRY configure
check to fail, so "lfs rm_entry" always returns -ENOTSUP.

Implement an alternative approach to avoid external dependencies on
the lustre_user.h header.  Also, LC_IOC_REMOVE_ENTRY is itself
unnecessary; the code can check for LL_IOC_REMOVE_ENTRY directly.
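
One way to avoid depending on strchrnul() is a small helper written in
standard C. The sketch below only illustrates the idea; the helper name
is hypothetical, and this is not necessarily the approach taken by the
patch.

```c
/* Return a pointer to the first occurrence of c in s, or to the
 * terminating NUL if c does not occur. Uses standard C only, so no
 * _GNU_SOURCE is required. */
const char *sketch_strchrnul(const char *s, int c)
{
	while (*s != '\0' && *s != (char)c)
		s++;
	return s;
}
```

Unlike strchr(), the helper never returns NULL, so the caller can
compute the length of the leading component as a pointer difference
without an extra check.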

Replace the NFS-specific -ENOTSUP error return code with -EOPNOTSUPP.

Fix the compile test_400[ab] checks to not use "-std=c99" to verify
that the uapi headers are usable without this dependency.

Fixes: b59835f8b6 ("LU-13903 utils: have liblustreapi support Linux client")
Fixes: 7a7309fa84 ("LU-13274 uapi: make lustre UAPI headers C99 compliant")
Fixes: 6331eadbd6 ("LU-15420 uapi: avoid gcc-11 -Werror=stringop-overread")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If42743a2148c317b8a9b701ceb5d08bac5149f5f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49328
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-16390 tests: check Lustre filefrag in sanity-flr/49a 86/49386/2
Andreas Dilger [Tue, 13 Dec 2022 07:01:06 +0000 (00:01 -0700)]
LU-16390 tests: check Lustre filefrag in sanity-flr/49a

Check that a Lustre-patched filefrag is installed when running
sanity-flr test_49a.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic909ea4ca160d47480004f53a96ce7539ce5076c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49386
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-16386 mkfs: Handle --version argument correctly 79/49379/3
Arshad Hussain [Mon, 12 Dec 2022 14:42:44 +0000 (09:42 -0500)]
LU-16386 mkfs: Handle --version argument correctly

Running mkfs.lustre with --version or -V argument
fails instead of printing the version. This patch
fixes the error.

Without patch:
--------------
$ ./lustre/utils/mkfs.lustre --version
usage: mkfs.lustre <target type> [--backfstype=ldiskfs]
<snip>

With patch:
-----------
$ ./lustre/utils/mkfs.lustre --version
mkfs.lustre 2.15.52_175_ge7aa83d

Test-Parameters: trivial fstype=zfs testlist=sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I4d4d1144d669fce8b02e9f8c3fb5f45f68b337b4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-14073 ldiskfs: don't test LDISKFS_IOC_FSSETXATTR 53/49353/4
Mr NeilBrown [Fri, 9 Dec 2022 05:31:13 +0000 (16:31 +1100)]
LU-14073 ldiskfs: don't test LDISKFS_IOC_FSSETXATTR

EXT4_IOC_FSSETXATTR was removed upstream in Linux 5.9, Commit
cb29a02d3a9d ("ext4: use generic names for generic ioctls").
So we cannot use it to test if project quotas are supported.

Instead test if EXT4_MAXQUOTAS is 3.  This was changed to 3 upstream
in the commit immediately before EXT4_IOC_FSSETXATTR was added, so it
is effectively the same test.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I88c51c03959ebe98cd5066596f5158fac570a625
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49353
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16376 obdclass: NUL terminate long jobid strings 51/49351/3
Andreas Dilger [Thu, 8 Dec 2022 18:43:57 +0000 (11:43 -0700)]
LU-16376 obdclass: NUL terminate long jobid strings

It appears that some jobid names can be sent that are using the full
32-byte size, rather than containing an embedded NUL terminator. This
caused errors in lprocfs_job_stats_log() when it overflowed.

If lustre_msg_get_jobid() does not find a NUL terminator within the
buffer, add one, so that the rest of the code doesn't have to deal
with unterminated strings.
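
A minimal sketch of the termination step, assuming the 32-byte buffer
mentioned above and assuming the fix truncates in place. The macro and
function names are hypothetical, and the actual patch may handle the
buffer differently:

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_JOBID_SIZE 32	/* buffer size quoted in the text above */

/* Ensure buf (SKETCH_JOBID_SIZE bytes) holds a NUL-terminated string.
 * Returns true if the last byte had to be overwritten to terminate it. */
bool sketch_jobid_terminate(char *buf)
{
	if (memchr(buf, '\0', SKETCH_JOBID_SIZE) != NULL)
		return false;	/* already terminated, leave untouched */

	buf[SKETCH_JOBID_SIZE - 1] = '\0';
	return true;
}
```

A jobid that fills all 32 bytes ends up as a 31-character string, while
a jobid that already contains a NUL is left unchanged.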

This potentially exposes a larger issue that other places may not be
handling the unterminated string properly either, which needs to be
addressed separately on both the client and server.  Terminating the
jobid to 31 chars only on the client does not totally solve the issue,
since there will still be older clients that are not doing this, so
the server needs to handle this in any case.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4c05fabdacb6a0bbf6477d3601a628fe1f3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49351
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-14707 tests: Bashify more scripts for Ubuntu et. al. 96/49296/3
Timothy Day [Thu, 1 Dec 2022 19:18:31 +0000 (19:18 +0000)]
LU-14707 tests: Bashify more scripts for Ubuntu et. al.

Some scripts that are not POSIX sh are being
invoked using sh. The scripts should be called
using the shell listed in the shebang.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7233ce56df95a5b8698b39872e6118a4fa1a029a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49296
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-15014 osc: Fix possible null pointer 75/44975/3
Patrick Farrell [Thu, 6 Oct 2022 11:40:41 +0000 (07:40 -0400)]
LU-15014 osc: Fix possible null pointer

Change init to fix possible null pointer access.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id1bee8b5ea5fb92a8831992ad44c487c69d52e1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44975
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16231 misc: rename lprocfs_stats functions 47/48847/3
Andreas Dilger [Thu, 13 Oct 2022 06:05:04 +0000 (00:05 -0600)]
LU-16231 misc: rename lprocfs_stats functions

Rename lprocfs_{alloc,register,clear,free}_stats() to be
lprocfs_stats_*() so these functions can be found more easily
in relation to struct lprocfs_stats.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I671284a86ee2a1fd3c58da75923f9467e72540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ellis Wilson <elliswilson@microsoft.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16157 lnet: lst read-outside of allocation 47/48547/3
Alexey Lyashkov [Wed, 14 Sep 2022 19:59:11 +0000 (22:59 +0300)]
LU-16157 lnet: lst read-outside of allocation

lnet_selftest expects some parameters from userspace,
but they are never sent. This causes a read outside of the
allocation, such as:
BUG: KASAN: slab-out-of-bounds in lstcon_testrpc_prep+0x19e7/0x1bb0
Read of size 4 at addr ffff8888bbaa866c by task lt-lst/6371

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I2a98e60c4be65c49fa9da4b418e50f1c7309b69d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48547
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-16366 build: Add LCME_FL_PARITY to wirecheck 11/49311/2
Shaun Tancheff [Mon, 5 Dec 2022 04:39:03 +0000 (22:39 -0600)]
LU-16366 build: Add LCME_FL_PARITY to wirecheck

 - OBD_MD_DOM_SIZE: Should use 0x instead of 0X for consistency.
 - LCME_FL_PARITY should be included in wirecheck and wiretest
 - QIF_DQBLKSIZE_BITS used where QIF_DQBLKSIZE is expected

Test-Parameters: trivial
Fixes: 4c47900889 ("LU-12186 ec: add necessary structure member for EC file")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic2eecfc2b1945b5b249bb341f791a99c5b109b97
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49311
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16364 llite: Move d_u.d_alias compat define 08/49308/2
Shaun Tancheff [Fri, 2 Dec 2022 19:40:38 +0000 (13:40 -0600)]
LU-16364 llite: Move d_u.d_alias compat define

The d_u.d_alias compat define breaks zpl_d_drop_aliases (seen in 2.1.7).

The only user of d_alias is llite so move the
define to a header private to llite.

Test-Parameters: trivial
HPE-bug-id: LUS-11394
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I44f511073f4dd17fd6dba1588e88d29cdfd3f6cb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49308
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16363 build: fiemap flexible array 05/49305/5
Shaun Tancheff [Mon, 5 Dec 2022 04:32:35 +0000 (22:32 -0600)]
LU-16363 build: fiemap flexible array

Linux commit v5.19-rc2-1-g94dfc73e7cf4
  treewide: uapi: Replace zero-length arrays with flexible-array members

Adjust wiretest to handle a flexible array, for which
sizeof(fiemap->fm_extents) is undefined.
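
The sizeof limitation can be illustrated with a small stand-alone
example. The structures below are hypothetical stand-ins rather than the
real fiemap definitions; the point is only that sizes involving a
flexible array member are computed from offsetof() and the element size:

```c
#include <stddef.h>

/* Hypothetical stand-ins for an extent record and its container. */
struct sketch_extent {
	unsigned long long logical;
	unsigned long long length;
};

struct sketch_extent_map {
	unsigned int count;
	struct sketch_extent extents[];	/* flexible array member */
};

/* sizeof(map->extents) is not valid for a flexible array member, so
 * the size is built from the array offset and the size of one element. */
size_t sketch_map_bytes(unsigned int count)
{
	return offsetof(struct sketch_extent_map, extents) +
	       (size_t)count * sizeof(struct sketch_extent);
}
```

A check that previously relied on sizeof of the array can instead
compare the array offset and the size of a single element.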

Test-Parameters: trivial
HPE-bug-id: LUS-11388
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia2692d126a871b43e9144e5d151215166604702d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16359 build: RHEL use Module.symvers during find-provides 02/49302/4
Shaun Tancheff [Fri, 2 Dec 2022 14:46:19 +0000 (08:46 -0600)]
LU-16359 build: RHEL use Module.symvers during find-provides

find-provides fails to find module versions on newer
kernels.

Module.symvers is always generated and correct.
Install it to the well-known location BUILDROOT,
use it to generate provides, and ignore it for installation.

Create a new find-provides and find-provides.ksyms for
lustre based on the one provided by the redhat-rpm-config
package using Module.symvers to supply the symbol versions
instead of extracting symbol versions from the .ko files.

Test-Parameters: trivial
HPE-bug-id: LUS-11383
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I01c3b3692e6a2a6be86a6930eaead9df75147f90
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-13705 utils: fix llstat -n option 89/49289/3
Andreas Dilger [Thu, 1 Dec 2022 00:04:26 +0000 (17:04 -0700)]
LU-13705 utils: fix llstat -n option

The '-n' option was not configured as a valid option for getopt
and would return an error if specified, instead of limiting the
number of stats outputs as it should.

Test-Parameters: trivial
Fixes: 3e0d994fbf4c ("LU-13705 utils: improve llstat/llobdstat usability")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifacafce741854fe12b80ced28e95bc7cc9254035
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-16353 config: enable_foo variables mustn't contains space 82/49282/2
Mr NeilBrown [Wed, 30 Nov 2022 00:47:03 +0000 (11:47 +1100)]
LU-16353 config: enable_foo variables mustn't contains space

$enable_crypto is in some circumstances set to "embedded llcrypt",
which contains a space.
When the code from lustre-build.m4 then tests the value with:

   if test x$enable_crypto = xyes

we get a syntax error from ./configure.

We could add quotes to this test, but for consistency we would need
to add quotes to every other test for an enable_foo variable.

It is simpler just to ensure we don't add spaces.  So change the space
to a hyphen.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I097e857409d6ec48a765ccda1cc470d28b90e601
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-16346 utils: fix lctl stack smashing 54/49254/4
Artem Blagodarenko [Fri, 25 Nov 2022 12:01:06 +0000 (12:01 +0000)]
LU-16346 utils: fix lctl stack smashing

on aarch64 architecture:
... 
exec lustre/utils/.libs/lctl dl
*** stack smashing detected ***: terminated
Aborted (core dumped)

genlmsg_parse() was misused in yaml_netlink_msg_parse().
It requires an attribute array of maxtype+1 elements, but maxtype+1
was also passed as the maxtype argument. The argument should be maxtype.
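
The sizing rule behind this fix can be sketched in plain C. The parser below is a hypothetical stand-in (demo_parse and DEMO_ATTR_MAX are illustrative names, not libnl or Lustre code). It is assumed to behave like a netlink-style attribute parser that fills tb[0] through tb[maxtype], so the caller declares maxtype + 1 slots and passes maxtype itself.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ATTR_MAX 3	/* hypothetical highest attribute type */

struct demo_attr {
	int type;
	int value;
};

/*
 * Hypothetical stand-in for a netlink-style attribute parser: it clears and
 * fills tb[0] .. tb[maxtype], so tb must have maxtype + 1 elements.
 * Returns the number of distinct attribute types stored.
 */
int demo_parse(const struct demo_attr *attrs, size_t nattrs,
	       const struct demo_attr *tb[], int maxtype)
{
	int stored = 0;
	size_t i;

	memset(tb, 0, sizeof(tb[0]) * (size_t)(maxtype + 1));
	for (i = 0; i < nattrs; i++) {
		int type = attrs[i].type;

		if (type < 0 || type > maxtype)
			continue;	/* skip unknown attribute types */
		if (tb[type] == NULL)
			stored++;
		tb[type] = &attrs[i];	/* a later duplicate overrides */
	}
	return stored;
}

/* Correct caller: DEMO_ATTR_MAX + 1 slots, maxtype = DEMO_ATTR_MAX. */
int demo_parse_message(const struct demo_attr *attrs, size_t nattrs)
{
	const struct demo_attr *tb[DEMO_ATTR_MAX + 1];

	/*
	 * Passing DEMO_ATTR_MAX + 1 as maxtype would let the parser touch
	 * one slot past the end of tb[], the kind of overrun that
	 * stack-smashing detection reports.
	 */
	return demo_parse(attrs, nattrs, tb, DEMO_ATTR_MAX);
}
```

The array size and the maxtype argument differ by exactly one, which is why the two are easy to confuse.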

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Ic9cd0de35a028ca28bdd112700296d21e04a1cc5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49254
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
22 months agoLU-14992 tests: add more mkdir_on_mdt0 calls 52/49252/3
Mr NeilBrown [Sun, 27 Nov 2022 20:49:50 +0000 (07:49 +1100)]
LU-14992 tests: add more mkdir_on_mdt0 calls

A previous patch changed some mkdir calls in test_133a to
mkdir_on_mdt0. This allows stats collected from mdt0 to
reflect the mkdir.

However two mkdir calls were missed, so "crossdir_rename" stats can be
wrong.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity env=ONLY=133a

Fixes: f0324c5c2f ("LU-14992 tests: sanity/replay-vbr mkdir on MDT0")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4e5c2e5504307462bff4012a13ef9deb24f8da8c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49252
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-15816 tests: use correct ost host to manage failure 48/49248/2
Mr NeilBrown [Fri, 25 Nov 2022 05:13:20 +0000 (16:13 +1100)]
LU-15816 tests: use correct ost host to manage failure

sanity test_398m sets up striping across 2 OSTs.  It ensures that
failing IO to either OST individually will fail the total IO.

However it sends the command to fail IO for the second OST (OST1) to
the host managing the first OST (ost1).  If the first 2 OSTs are on
the same host, this works.  If not, it fails.

Also, the error messages when testing the second stripe say "first
stripe".

Test-Parameters: trivial env=ONLY=398m
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic7085dab2610fa2c044a966fd8de40def0438ca4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49248
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-16334 llite: update statx size/ctime for fallocate 21/49221/4
Qian Yingjin [Wed, 23 Nov 2022 07:44:47 +0000 (02:44 -0500)]
LU-16334 llite: update statx size/ctime for fallocate

In the VFS interface ->fallocate(), it should update i_size and
i_ctime returned by statx() accordingly when the file size grows.

Add sanity/150h.

fallocate() call does not update the attributes on MDT.
We use STATX with cached-always mode to verify it as it will not
send Glimpse lock RPCs to OSTs to obtain file size information
and use the cached attributes (size) on the client side as much
as possible.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib8128892222a01cd00250c704328bd13cfb12e2d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49221
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-930 docs: add lfs-rm_entry.8 man page 64/49064/6
Andreas Dilger [Mon, 7 Nov 2022 21:56:24 +0000 (14:56 -0700)]
LU-930 docs: add lfs-rm_entry.8 man page

Add man page for "lfs rm_entry" and alias "lfs rmentry".

Test-Parameters: trivial
Fixes: 2ad263c602 ("LU-1187 utils: add lfs setdirstripe/getdirstripe")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I783f23ec8fd0c75c69bcc78c180a07e54dd0c8a1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49064
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-16291 build: make kobj_type constant 43/49043/3
Jian Yu [Fri, 2 Dec 2022 19:27:43 +0000 (11:27 -0800)]
LU-16291 build: make kobj_type constant

Kernel v5.16-rc2-28-gee6d3dd4ed48:
commit ee6d3dd4ed48ab24b74bab3c3977b8218518247d
driver core: make kobj_type constant.

This patch makes struct kobj_type constant to fix
the following build failure against kernel 5.16:

lustre/obdclass/obd_config.c: In function 'class_modify_config':
lustre/obdclass/obd_config.c:1639:13: error: assignment discards
'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
1639 |         typ = get_ktype(kobj);
     |             ^

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I19e0d1f4e3cf97f6871e038487cda9294ac1f67b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49043
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
22 months agoLU-16205 sec: reserve flag for fid2path for encrypted files 28/49028/3
Sebastien Buisson [Thu, 3 Nov 2022 10:47:46 +0000 (11:47 +0100)]
LU-16205 sec: reserve flag for fid2path for encrypted files

Reserve OBD_CONNECT2_ENCRYPT_FID2PATH connection flag for fid2path
support for encrypted files.
This connection flag is required so that newer servers continue to
return -ENODATA to older clients.
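
How a reserved connection flag could later gate behavior can be sketched as follows. This is an assumption-laden illustration, not the Lustre implementation: the bit value and the function demo_fid2path_encrypted are hypothetical, and the real OBD_CONNECT2_ENCRYPT_FID2PATH value lives in the Lustre headers.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit position, chosen only for this sketch. */
#define DEMO_CONNECT2_ENCRYPT_FID2PATH (UINT64_C(1) << 5)

/*
 * Hypothetical server-side check: only clients that advertised the flag at
 * connect time get a fid2path answer for an encrypted file. Clients that
 * did not advertise it keep receiving -ENODATA, as before.
 */
int demo_fid2path_encrypted(uint64_t client_connect_flags2)
{
	if (!(client_connect_flags2 & DEMO_CONNECT2_ENCRYPT_FID2PATH))
		return -ENODATA;
	return 0;
}
```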

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I505b90a061687a7ef481adacca98908c96e487be
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49028
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-16159 lod: cancel update llogs upon recovery abort 84/48584/26
Lai Siyao [Sun, 28 Aug 2022 18:35:25 +0000 (14:35 -0400)]
LU-16159 lod: cancel update llogs upon recovery abort

If recovery is aborted, cancel the update catalog from the catlist, but
keep the llogs on disk for some time (for debugging purposes). This
avoids accumulating stale update records, and also avoids recovery
problems if update llogs are corrupt.

Update llogs are canceled after recovery completes and before regular
request processing. For these logs, their ctime will be set, and the log
header will be marked with LLOG_F_MAX_AGE|LLOG_F_RM_ON_ERR; after
30 days have passed, they will be removed automatically.

Tidy up recovery abort code:
* if obd_abort_recovery is set, or OBD is stopping, stop both
  client recovery and MDT recovery.
* otherwise if obd_abort_mdt_recovery is set, stop MDT recovery only.

lctl llog_print supports printing update log FIDs used by the specified
MDT:
* "lctl --device <MDT> llog_print update_log" will list all update
  llog FIDs used by this MDT device.

Disabled replay-single.sh 100c stripe check because abort_recovery
will cancel update llogs, and won't replay them upon next recovery.

Added replay-single.sh 100d.

Formatall in the end of replay-single.sh because directory unlink may
fail.

Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single,replay-single,replay-single,replay-single,replay-single,replay-single,replay-single,replay-single,replay-single
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie2bda6c097d65f5c51cba66c2dbf6ae4a5d36dda
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48584
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-15801 ldiskfs: Server support for RHEL9 69/47169/10
Shaun Tancheff [Tue, 15 Nov 2022 12:10:42 +0000 (06:10 -0600)]
LU-15801 ldiskfs: Server support for RHEL9

RHEL9 server patches update from SUSE 15 SP 4 series

Test-Parameters: trivial
HPE-bug-id: LUS-10920
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I221f946d09892bf90406da70aa16432e5753d18a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47169
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
22 months agoLU-16114 build: Update security_dentry_init_security args 59/48359/6
Shaun Tancheff [Sun, 28 Aug 2022 14:38:39 +0000 (21:38 +0700)]
LU-16114 build: Update security_dentry_init_security args

Linux commit v5.15-rc1-20-g15bf32398ad4
   security: Return xattr name from security_dentry_init_security()

Adjust security_dentry_init_security() calls accordingly

Test-Parameters: trivial
HPE-bug-id: LUS-11188
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I42d3307f7fe0d2412381363f60ac5b3df2d5891a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48359
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
22 months agoLU-16112 build: ki_complete removed unused argument 57/48357/9
Shaun Tancheff [Wed, 30 Nov 2022 12:12:51 +0000 (06:12 -0600)]
LU-16112 build: ki_complete removed unused argument

Linux commit v5.15-rc6-145-g6b19b766e8f0
   fs: get rid of the res2 iocb->ki_complete argument

Prior to 4.1 Linux provided an aio_complete(iocb, res, res2)
which propagated res2 to io_event.res2. This functionality
migrated to iocb->ki_complete().

Provide a wrapper around iocb->ki_complete() to use
aio_complete() or iocb->ki_complete() as appropriate.
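
The wrapper idea can be sketched outside the kernel with a mock structure. Everything here is hypothetical (demo_kiocb, demo_aio_complete, and the DEMO_KI_COMPLETE_2ARGS switch are illustrative names), and the sketch models only the two ki_complete() signatures, not the older aio_complete() path.

```c
/*
 * Define DEMO_KI_COMPLETE_2ARGS to model kernels where ki_complete() no
 * longer takes res2. The macro name is hypothetical, not a Lustre one.
 */
struct demo_kiocb {
#ifdef DEMO_KI_COMPLETE_2ARGS
	void (*ki_complete)(struct demo_kiocb *iocb, long res);
#else
	void (*ki_complete)(struct demo_kiocb *iocb, long res, long res2);
#endif
	long result;	/* filled in by the demo callback */
};

/* Wrapper: callers use a single signature whatever the kernel provides. */
static inline void demo_aio_complete(struct demo_kiocb *iocb, long res)
{
#ifdef DEMO_KI_COMPLETE_2ARGS
	iocb->ki_complete(iocb, res);
#else
	iocb->ki_complete(iocb, res, 0);	/* res2 is unused, pass 0 */
#endif
}

/* Demo completion callback matching whichever signature is configured. */
#ifdef DEMO_KI_COMPLETE_2ARGS
static void demo_done(struct demo_kiocb *iocb, long res)
{
	iocb->result = res;
}
#else
static void demo_done(struct demo_kiocb *iocb, long res, long res2)
{
	(void)res2;	/* ignored in this sketch */
	iocb->result = res;
}
#endif
```

Keeping the conditional in one wrapper means call sites do not need their own version checks.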

HPE-bug-id: LUS-11187
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I11d1ee61528d4d89e2a316fd71066824b202dac7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48357
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-15581 utils: add check_iam util 75/46575/17
Artem Blagodarenko [Wed, 7 Sep 2022 12:46:54 +0000 (08:46 -0400)]
LU-15581 utils: add check_iam util

A tool for parsing and checking IAM files, and a test to check that
the utility works without segfaults on corrupted files.

To process all files in OI catalog:
for f in /root/md65_ldiskfs/oi.16.*; do
echo $f; lustre/utils/check_iam -v $f;
done > output.txt 2>&1

Test-Parameters: trivial testlist=conf-sanity env=ONLY=134
HPE-bug-id: LUS-10501
Change-Id: I7a8e83bc2720040e48c953511801816fd3dd6288
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46575
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-6142 lustre: fix minor typos in comments 74/49274/2
Mr. NeilBrown [Tue, 29 Nov 2022 17:08:38 +0000 (12:08 -0500)]
LU-6142 lustre: fix minor typos in comments

Fix minor typos in comments.

Linux-commit: d88727b ("staging: lustre: fix minor typos in comments")

Test-Parameters: trivial
Change-Id: I2232597d261c8d33d21bdfe690a5b7460bf4069d
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49274
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-15707 lod: force creation of a component without a pool 55/46955/9
Etienne AUJAMES [Wed, 30 Mar 2022 16:43:44 +0000 (18:43 +0200)]
LU-15707 lod: force creation of a component without a pool

This patch adds the pool option "lfs setstripe -p ignore" to force
the creation of a component without a pool set by inheritance (from
parent or root).

e.g:
$ lfs setstripe -p pool tdir
$ lfs setstripe -E1M -p ignore -E-1 -p '' -c2 -S2M tdir/tfile
$ lfs getstripe -I1 -p tdir/tfile
(no pool set)
$ lfs getstripe -I2 -p tdir/tfile
pool
(inherited from tdir)

This patch adds the test "ost-pools test_32" to verify this behavior.

The poorly-named "-p none" keyword, which indicates the pool name
should be inherited from the root or parent dir layout, will be
eventually replaced by the new "-p inherit" keyword.

Test-Parameters: serverdistro=el7.9 serverversion=2.12.8 testlist=ost-pools env=ONLY=32,ONLY_REPEAT=50
Test-Parameters: testlist=ost-pools env=ONLY=32,ONLY_REPEAT=50
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I782cbafe209cff6857162303a4650f5e3b438be5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-16231 misc: fix stats snapshot_time to use wallclock 21/48821/7
Andreas Dilger [Tue, 11 Oct 2022 09:09:08 +0000 (03:09 -0600)]
LU-16231 misc: fix stats snapshot_time to use wallclock

The timestamps reported during stats collection inadvertently changed
from being POSIX epoch timestamps to elapsed-from-boot timestamps.

While some collection tools ignore these timestamps, or only use the
delta between successive reads, having uniform timestamps in stats
files simplifies stats correlation between different servers.

Revert the snapshot_time back to showing wallclock time.
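
Reading wallclock time, as opposed to time since boot, can be sketched in userspace C. This is only an illustration: the kernel code uses its own time interfaces, and demo_snapshot_time_now and demo_format_secs_nsecs are hypothetical names. The formatter prints the secs.nsecs form with a nine-digit nanosecond field.

```c
#include <stdio.h>
#include <time.h>

/* Fetch the current wallclock (POSIX epoch) time; returns 0 on success. */
int demo_snapshot_time_now(struct timespec *ts)
{
	/* timespec_get() with TIME_UTC is the C11 way to read epoch time. */
	return timespec_get(ts, TIME_UTC) == TIME_UTC ? 0 : -1;
}

/*
 * Format a timestamp as "<secs>.<nsecs>", zero-padding nsecs to 9 digits.
 * Returns the snprintf() result (length that was or would be written).
 */
int demo_format_secs_nsecs(char *buf, size_t len, const struct timespec *ts)
{
	return snprintf(buf, len, "%lld.%09ld",
			(long long)ts->tv_sec, ts->tv_nsec);
}
```

Epoch-based values are directly comparable between servers with synchronized clocks, whereas values counted from boot depend on each node's uptime.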

Some "init" times were not initialized when stats were allocated or
cleared; do this for all stats shown by lprocfs_stats_header().

Rename struct osc_device fields from od_ to osc_ to avoid confusion
with struct osd_device. Having two od_stats was especially confusing.

Add a test case to verify snapshot_time, start_time, elapsed_time.

Test-Parameters: testlist=sanity env=ONLY=127a,ONLY_REPEAT=100
Fixes: ea2cd3af7b ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I421c3b0301c2566b48c2fc6fe7bb8b54ec48ca5d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48821
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Ellis Wilson <elliswilson@microsoft.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-16110 lprocfs: make job_stats and rename_stats valid YAML 17/48417/27
Lei Feng [Fri, 2 Sep 2022 07:05:22 +0000 (15:05 +0800)]
LU-16110 lprocfs: make job_stats and rename_stats valid YAML

Adjust the format of job_stats and rename_stats to make
them valid YAML.  This fixes the output to correctly indent
the items to follow YAML formatting rules.

Add a test case to verify the format of these params is valid
YAML to avoid other errors being introduced in the future.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: Idca36621241e97ff87f8ab0448f3c5604057a460
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48417
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-16344 docs: Improve explaination in manual of lfs-getstripe 56/49256/2
Xing Huang [Mon, 28 Nov 2022 11:29:32 +0000 (19:29 +0800)]
LU-16344 docs: Improve explaination in manual of lfs-getstripe

Modify the --ost|-O explanation in manual of lfs-getstripe.

Test-Parameters: trivial
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I0b99906d07ac23126914d75c70efe4899069d507
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49256
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-12837 doc: add lfs-changelog* manpages 09/49209/3
Etienne AUJAMES [Tue, 22 Nov 2022 12:39:25 +0000 (13:39 +0100)]
LU-12837 doc: add lfs-changelog* manpages

This patch moves the documentation for "lfs changelog" and "lfs
changelog_clear" utilities from "lfs.1" to the following manpages:
- lfs-changelog.1
- lfs-changelog_clear.1

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Test-Parameters: trivial
Change-Id: I6db2e687e506a6116fe4755358a9abbd5509c3bb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49209
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-13665 lfs: use correct DST setting for mktime() 06/49206/3
Mr NeilBrown [Tue, 22 Nov 2022 05:09:59 +0000 (16:09 +1100)]
LU-13665 lfs: use correct DST setting for mktime()

When lfs is passed a "-newerXY" arg when Y=='t' and the arg doesn't
start %H, it leaves ->tm_isdst set to 0 which tells mktime() to assume
that DST is not active.
This means that it produces incorrect results for times when DST is
active.

We should set ->tm_isdst to -1 to tell mktime() that it is not known
whether DST is active.  Then mktime() will use the timezone database
to determine the correct DST setting.
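
The tm_isdst handling can be sketched as below. This is a minimal illustration, not the lfs code: demo_date_to_epoch is a hypothetical helper, and it parses a fixed YYYY-MM-DD form with sscanf() purely to stay within standard C.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: convert "YYYY-MM-DD" (local midnight) to epoch
 * seconds. Returns (time_t)-1 if the string does not parse.
 */
time_t demo_date_to_epoch(const char *str)
{
	struct tm tm;
	int year, mon, day;
	char extra;

	/* Exactly three fields must match, with no trailing characters. */
	if (sscanf(str, "%d-%d-%d%c", &year, &mon, &day, &extra) != 3)
		return (time_t)-1;
	if (mon < 1 || mon > 12 || day < 1 || day > 31)
		return (time_t)-1;

	memset(&tm, 0, sizeof(tm));	/* midnight; also leaves tm_isdst at 0 */
	tm.tm_year = year - 1900;
	tm.tm_mon = mon - 1;
	tm.tm_mday = day;
	tm.tm_isdst = -1;	/* unknown: let mktime() consult timezone data */

	return mktime(&tm);
}
```

Leaving tm_isdst at 0 after the memset() would tell mktime() that DST is not in effect, which is the situation the commit message describes.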

This allows us to re-enable test 56oc on all platforms.

Test-Parameters: trivial clientdistro=sles15sp3 env=ONLY=56
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I94afba96e2563442786726096501c5ec0b40a881
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-16313 pcc: use two bits to indicate pcc type for attach 60/49160/5
Qian Yingjin [Tue, 15 Nov 2022 06:57:08 +0000 (01:57 -0500)]
LU-16313 pcc: use two bits to indicate pcc type for attach

PCC currently supports two types: readwrite and readonly.
The attach data structure @lu_pcc_attach uses a 32-bit value to
indicate the PCC type:
struct lu_pcc_attach {
	__u32 pcca_type;
	__u32 pcca_id;
};

This patch changes it to use 2 bits to represent the PCC type.
The remaining bits in @pcca_type can be used as flags for attach such
as a flag to indicate using the asynchronous attach via the
command "lfs pcc attach -A" for PCCRO.
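
Splitting a 32-bit field into a 2-bit type and flag bits can be sketched like this. All macro names, helper names, and bit assignments below are hypothetical and chosen for illustration; the values actually used by Lustre are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical layout: low 2 bits hold the type, the rest hold flags. */
#define DEMO_PCC_TYPE_BITS	2
#define DEMO_PCC_TYPE_MASK	((UINT32_C(1) << DEMO_PCC_TYPE_BITS) - 1)
#define DEMO_PCC_FLAG_ASYNC	(UINT32_C(1) << DEMO_PCC_TYPE_BITS)

/* Extract the 2-bit type from the combined field. */
static inline uint32_t demo_pcca_type(uint32_t pcca_type)
{
	return pcca_type & DEMO_PCC_TYPE_MASK;
}

/* Extract the flag bits (everything above the type bits). */
static inline uint32_t demo_pcca_flags(uint32_t pcca_type)
{
	return pcca_type & ~DEMO_PCC_TYPE_MASK;
}

/* Combine a type and flags, masking each into its own bit range. */
static inline uint32_t demo_pcca_pack(uint32_t type, uint32_t flags)
{
	return (type & DEMO_PCC_TYPE_MASK) | (flags & ~DEMO_PCC_TYPE_MASK);
}
```

Because the type keeps the low bits, a field that carries only a small type value and no flags reads the same under both interpretations.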

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Idee26018642a174b04d1d36a81952ea98a06514e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49160
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-15544 osd-ldiskfs: Update bio_set_dev and BIO_MAX_VECS 35/49135/5
Shaun Tancheff [Fri, 11 Nov 2022 08:19:53 +0000 (02:19 -0600)]
LU-15544 osd-ldiskfs: Update bio_set_dev and BIO_MAX_VECS

Linux commit v5.11-rc5-9-g309dca309fc3
  block: store a block_device pointer in struct bio
created bio_set_dev macro and
Linux commit v5.15-rc6-127-gcf6d6238cdd3
  block: turn macro helpers into inline functions
changed the macro to an inline function for bio_set_dev.
This change tests for bio_set_dev and provides one, if one
is not provided by the kernel.

Linux commit v5.12-rc1-20-ga8affc03a9b3
   block: rename BIO_MAX_PAGES to BIO_MAX_VECS
This change provides a fallback for older kernels when
BIO_MAX_VECS is not defined.
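
A fallback of this kind is commonly written as a guarded define. The sketch below assumes that shape; whether the patch uses exactly this form is not stated here. BIO_MAX_PAGES is defined locally with an illustrative value to stand in for an older kernel header, and demo_bio_vecs is a hypothetical caller.

```c
/*
 * Stand-in for an older kernel header that only knows the old name.
 * The value is illustrative.
 */
#define BIO_MAX_PAGES 256

/* Compatibility fallback: map the new name onto the old one if needed. */
#ifndef BIO_MAX_VECS
#define BIO_MAX_VECS BIO_MAX_PAGES
#endif

/* Hypothetical caller written only against the new name. */
static inline unsigned int demo_bio_vecs(unsigned int npages)
{
	return npages < BIO_MAX_VECS ? npages : BIO_MAX_VECS;
}
```

On a kernel that already defines BIO_MAX_VECS the guard is skipped, so the same source builds against both.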

Linux commit v5.11-rc4-8-g5857c9209ce5
  mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
breaks an existing configure test for vm_fault.address.
This changes the configure test for vm_fault.address so it does
not fail due to address being const.

Test-Parameters: trivial
HPE-bug-id: LUS-10744
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I06d3bf60e32b969e1e635e378cbd1ee36293165c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
22 months agoLU-16308 llite: wake_up after cl_object_kill 30/49130/7
Lai Siyao [Thu, 10 Nov 2022 13:15:51 +0000 (08:15 -0500)]
LU-16308 llite: wake_up after cl_object_kill

cl_inode_fini() calls cl_object_kill() to set LU_OBJECT_HEARD_BANSHEE,
and then calls cl_object_put_last() to wait for object refcount to
become one. It should call wake_up() in between, in case someone is
waiting on the flag.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I244db71ee4ed9c39118e443b99c3b8a3a0aa4bc3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-16295 kernel: kernel update RHEL 7.9 [3.10.0-1160.80.1.el7] 45/49045/5
Jian Yu [Fri, 4 Nov 2022 07:15:39 +0000 (00:15 -0700)]
LU-16295 kernel: kernel update RHEL 7.9 [3.10.0-1160.80.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.80.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I50a0ee572d24ddc73f8af6dc32ef701c260e45b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
22 months agoLU-16262 tests: Remove sanity-gss.sh 48/48948/6
Arshad Hussain [Fri, 4 Nov 2022 05:10:47 +0000 (10:40 +0530)]
LU-16262 tests: Remove sanity-gss.sh

Purpose of sanity-gss is to test just the
GSSAPI code itself. This is done by making
use of the gssnull/null security flavor.
Currently, this is exercised through
non-regression tests with SSK and therefore
sanity-gss.sh is not required. This patch
removes sanity-gss.sh from repo due to above
reasons.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ief62e8111cafdc5bebca1f47a1b09fbafb152a76
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48948
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
22 months agoLU-16087 lprocfs: add histogram to stats counter 78/48278/12
Lei Feng [Wed, 17 Aug 2022 00:48:33 +0000 (08:48 +0800)]
LU-16087 lprocfs: add histogram to stats counter

Add histogram to stats counter.
Enable histogram for read/write_bytes in mdt/obdfilter
job stats.

Sample job_stats:
- job_id:          md5sum.0
  snapshot_time   : 3143196.864165417 secs.nsecs
  start_time      : 3143196.707206168 secs.nsecs
  elapsed_time    : 0.156959249 secs.nsecs
  read_bytes:      { samples: 2, ..., hist: { 32K: 1, 1M: 1 } }
  write_bytes:     { samples: 1, ..., hist: { 1K: 1 } }

Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: I75b6909c8b63f08b74c3c411ff3dcd27881bb839
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48278
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-15998 pcc: set hsm-root correctly during copytool setup 09/47909/3
Qian Yingjin [Fri, 8 Jul 2022 02:59:36 +0000 (22:59 -0400)]
LU-15998 pcc: set hsm-root correctly during copytool setup

During copytool setup, we set --hsm-root with the archive root
path of $SINGLEAGT. However, when --hsm-root is set explicitly via
"-h|--hsm-root", it should reset the hsm root to the specified
one. Otherwise, it will cause sanity-pcc/test_3b to fail.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ida6c1ff7459548b068fd62ce315fe8075633b5fc
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47909
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-8837 lustre: remove target declarations from obd.h 52/41952/11
Mr NeilBrown [Wed, 23 Nov 2022 21:35:25 +0000 (16:35 -0500)]
LU-8837 lustre: remove target declarations from obd.h

lu_target.h and obd_target.h are only needed in obd.h
for some structs in obd_device.u.  We don't really need to mention
these structs in the union as they are all quite small.

So we can define accessor functions that cast a pointer to the union
into the required type, and then we can completely remove these
includes from obd.h

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9b314b0bfc1baae03ccb8eadf134964ea308f638
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-8837 target: don't build any 'target' on client. 68/41768/9
Mr NeilBrown [Fri, 13 Nov 2020 02:28:23 +0000 (13:28 +1100)]
LU-8837 target: don't build any 'target' on client.

Nothing in the 'target/' directory is needed on the client,
so don't build it for client-only builds

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9a63c1a11c7b44edadc355bd323381ba1951376f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41768
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months agoLU-13485 build: Parallel build configure cache fixes 49/49149/3
Shaun Tancheff [Mon, 14 Nov 2022 15:43:32 +0000 (09:43 -0600)]
LU-13485 build: Parallel build configure cache fixes

This fixes the infrastructure for parallel builds when cached
results are enabled, the critical fix being proper
usage of AC_CACHE_CHECK in the LB2_LINUX_TEST_RESULT macro.

Test-Parameters: trivial
Fixes: b0209c2d4d ("LU-13485 build: Enable 2 stage configure tests")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I94d7ff291dfd4f2dc5e218acc811329b986f8fbf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49149
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-10391 lnet: change lnet_find_best_lpni to handle large NIDs 81/49181/3
James Simmons [Mon, 21 Nov 2022 14:23:56 +0000 (09:23 -0500)]
LU-10391 lnet: change lnet_find_best_lpni to handle large NIDs

Currently lnet_find_best_lpni() only handles small NID addresses
for the dst_nid. Change this to large NID address to allow IPv6
and other large address protocols.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I23ef73f5955a3016262d096706d5cf00ffa4abda
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months agoLU-15643 osd-ldiskfs: don't trigger scrub on irreparable FIDs 52/46852/12
Lai Siyao [Tue, 15 Mar 2022 19:43:14 +0000 (15:43 -0400)]
LU-15643 osd-ldiskfs: don't trigger scrub on irreparable FIDs

In osd_fid_lookup(), if the FID mapping found in OI table is insane,
it will be added into a list called os_inconsistent_items, and OI
scrub will be triggered.

Later if OI scrub can't fix this mapping, it should move this mapping
into a list called os_stale_items, and subsequent access of the same
FID should return -ESTALE immediately, rather than triggering OI
scrub repeatedly.

Add sanity-scrub 20. Remove sanity-scrub 1d, which is not a sane test
because it altered FID in LMA, which is the last to trust for an
object, and it could pass just by chance.

Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3ed8928506551416b1008121adbe385dedda29bc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46852
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-16317 build: dkms build requires flex, bison and libmount-devel 83/49183/2
Jian Yu [Thu, 17 Nov 2022 20:48:04 +0000 (12:48 -0800)]
LU-16317 build: dkms build requires flex, bison and libmount-devel

This patch fixes lustre.spec.in and lustre-dkms.spec.in to add
requires for flex, bison, libmount and libmount-devel. The last
two have already been added into lustre.spec.in.

Test-Parameters: trivial

Fixes: 121a79651f ("LU-15967 build: configure script does not check for required build tools")
Fixes: f21b944127 ("LU-15940 build: add a required dependency for libmount")

Change-Id: I9923fc7eb09f974e8c38c3664138486a424e16d7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49183
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-12837 doc: add llapi_changelog* manpages 84/49084/6
Etienne AUJAMES [Wed, 9 Nov 2022 16:11:31 +0000 (17:11 +0100)]
LU-12837 doc: add llapi_changelog* manpages

This patch adds the following manpages for the changelog API interface:
- llapi_changelog_clear.3
- llapi_changelog_fini.3
- llapi_changelog_free.3
- llapi_changelog_get_fd.3
- llapi_changelog_in_buf.3
- llapi_changelog_recv.3
- llapi_changelog_set_xflags.3
- llapi_changelog_start.3

This patch also cleans up outdated comments about changelogs in Lustre
API files.

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Test-Parameters: trivial
Change-Id: Ie1481b820661e9e0ce9277ba4c65f51774f1e6bb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49084
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-16304 kernel: kernel update RHEL8.7 [4.18.0-425.3.1.el8] 80/49080/3
Jian Yu [Tue, 15 Nov 2022 00:34:46 +0000 (16:34 -0800)]
LU-16304 kernel: kernel update RHEL8.7 [4.18.0-425.3.1.el8]

Update RHEL8.7 kernel to 4.18.0-425.3.1.el8.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I13e6d83ada1ec0c4da92f307bf56db5281c41892
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49080
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16303 lnet: Drop LNet message if deadline exceeded 78/49078/5
Chris Horn [Mon, 7 Nov 2022 22:06:32 +0000 (15:06 -0700)]
LU-16303 lnet: Drop LNet message if deadline exceeded

The LNet message deadline is set when a message is committed for
sending. A message can be queued while waiting for send credit(s)
after it has been committed. Thus, it is possible for a message
deadline to be exceeded while on the queue. We should check for this
when posting messages to the LND layer.
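
The check described above can be sketched as follows. This is a
simplified Python model, not the LNet code: drain_send_queue(), the
(name, deadline) tuples, and the use of -ETIMEDOUT as the drop status
are assumptions for illustration.

```python
import errno
from collections import deque


def drain_send_queue(queue, credits, now):
    """Post queued messages while credits remain.

    Each queue entry is (name, deadline). A message whose deadline has
    already passed is dropped instead of being posted, and it does not
    consume a send credit. Returns (sent, dropped).
    """
    sent, dropped = [], []
    while queue and credits > 0:
        name, deadline = queue.popleft()
        if now > deadline:
            # Deadline was exceeded while the message sat on the queue.
            dropped.append((name, -errno.ETIMEDOUT))
            continue
        credits -= 1
        sent.append(name)
    return sent, dropped
```

With entries a (deadline 10), b (deadline 5) and c (deadline 20), two
credits, and now=7, the model posts a and c and drops b.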

HPE-bug-id: LUS-11333
Test-Parameters: trivial testlist=sanity-lnet env=ONLY=253,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I1315b2351536e63b9d4f22d9336a57415031e0c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49078
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-15898 tests: Move sanity/115 & 802a to conf-sanity 69/47469/8
Arshad Hussain [Fri, 27 May 2022 07:38:46 +0000 (03:38 -0400)]
LU-15898 tests: Move sanity/115 & 802a to conf-sanity

sanity/115 and sanity/802a reformat the filesystem, making
the whole sanity suite slow. These two tests are moved to
conf-sanity.

sanity/115 is now conf-sanity/114
sanity/802a is now conf-sanity/802a

This patch also replaces trap with stack_trap for
conf-sanity/134a and adds 133 to EXCEPT_SLOW list
under conf-sanity. It removes test 115 from
EXCEPT_SLOW list under sanity.

Test-Parameters: trivial env=SLOW=yes,ONLY="114 115" testlist=conf-sanity
Test-Parameters: trivial fstype=zfs env=ONLY=802a testlist=conf-sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iacb5aa8c5535a30669af3f94be97b68ee8688ccf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47469
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-15544 ldiskfs: SUSE 15 SP4 kernel 5.14.21 SUSE 04/46504/18
Shaun Tancheff [Tue, 15 Nov 2022 12:03:21 +0000 (06:03 -0600)]
LU-15544 ldiskfs: SUSE 15 SP4 kernel 5.14.21 SUSE

Updated patch series for SUSE 15 SP4 kernel 5.14.21 based on 5.10

Linux commit v5.14-rc2-19-g188c299e2a26
   ext4: Support for checksumming from journal triggers
Results in ext4_journal_get_write_access() having 4 arguments.
This change provides a compat wrapper for older kernels.

Linux commit v5.12-rc4-7-g471fbbea7ff7
  ext4: handle casefolding with encryption
This change impacts directory entry hash calculation and the
EXT4_DIR_REC_LEN and EXT4_DIR_ENTRY_LEN macros, which now require
the parent directory's inode. Similarly, ext4fs_dirhash() also
takes the parent directory's inode.
This change provides a compat wrapper for ext4fs_dirhash()
to support older kernels.

Patches dropped due to upstream ext4 landings:
  linux-5.9/ext4-simple-blockalloc.patch
  base/ext4-projid-xattrs.patch
  linux-5.8/ext4-enc-flag.patch

Test-Parameters: trivial
HPE-bug-id: LUS-10744
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic50227eaa231e2f1e98f4a7c9e5838e3303cbdf6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46504
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
23 months agoLU-8585 llapi: use open_by_handle_at in llapi_open_by_fid 03/36603/21
Quentin Bouget [Sun, 2 Jan 2022 16:12:42 +0000 (11:12 -0500)]
LU-8585 llapi: use open_by_handle_at in llapi_open_by_fid

Reimplement llapi_open_by_fid() to use llapi_fid_to_handle() and
open_by_handle_at(2) rather than using ioctl().  This works for
opens on subdirectory mountpoints, unlike ".lustre/fid/<fid>".

This patch also adds llapi_open_by_fid_at() which is similar to
llapi_open_by_fid() except that it takes an open directory file
descriptor or AT_FDCWD rather than a path as its first argument.

[AD:
- Move get_root_*() functions over to a new liblustreapi_root.c
  file in expectation of further enhancements to that code.
- Cache an open file handle on the root directory so repeated
  calls to llapi_open_by_fid() and llapi_fid2path() do not need
  to search for and open the same root directory path many times.
- Add man pages for newly-added functions.

  This reduces the system calls for llapi_fid_test significantly:

      original     patched
         14511        4315   total opens
         64807       34067   total syscalls
]

There may still be a need to have a fallback from open_by_handle_at()
to using ".lustre/fid/<FID>" to open the fid (if available), but
that can be added if this initial patch does not test well.  The
open_by_handle_at() method avoids reopening the "fid/" directory
each time (though this fd could also be cached), but it has the
drawback that it reconnects dentries to the root directory each time.

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I8a4904c996389da2b0894cd9fac639a398607535
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/36603
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-6142 clio: make cp_ref in cl_page a refcount_t 72/49072/3
Mr. NeilBrown [Tue, 8 Nov 2022 17:23:04 +0000 (12:23 -0500)]
LU-6142 clio: make cp_ref in cl_page a refcount_t

As this is used as a refcount, it should be declared
as one.

Change-Id: I8108e14e545bb56aae34a0f6ae9d5a04227fc067
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49072
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-15748 ofd: fix fallocate interop for older clients 48/47548/17
Andreas Dilger [Mon, 6 Jun 2022 23:14:32 +0000 (17:14 -0600)]
LU-15748 ofd: fix fallocate interop for older clients

The logic for detecting older client fallocate was backward, and
should be checking if the OBD_CONNECT_OLD_FALLOC flag was *not*
present to detect old clients.

Since the new server does not have OLD_FALLOC in the SUPPORTED flags,
it will clear the flag from the export at connect time, so it needs
to be saved in the export early on.
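
The ordering issue above can be illustrated with a small Python model.
The bit values, connect_export(), and the dictionary standing in for
the export are all hypothetical; the sketch only shows why the
client's flag has to be recorded before it is masked against the
server's supported flags.

```python
OBD_CONNECT_OLD_FALLOC = 1 << 0  # hypothetical bit position
OBD_CONNECT_OTHER = 1 << 1       # hypothetical unrelated feature bit


def connect_export(client_flags, server_supported):
    """Model of connect-time negotiation for one export."""
    # Per the commit text, an old client is one that does *not* send
    # the flag. Record this before masking, because the server's
    # supported set lacks OLD_FALLOC and the mask below clears it.
    old_falloc_client = not (client_flags & OBD_CONNECT_OLD_FALLOC)
    negotiated = client_flags & server_supported
    return {
        "connect_flags": negotiated,
        "old_falloc_client": old_falloc_client,
    }
```

Testing the negotiated flags instead would classify every client the
same way, since the bit is always cleared by then.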

Test-Parameters: testlist=sanityn env=ONLY=16,HONOR_EXCEPT=y
Test-Parameters: serverversion=2.14 testlist=sanityn env=ONLY=16,HONOR_EXCEPT=y
Fixes: 7905359296 ("LU-15748 osc: fallocate interop for 2.14 clients")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I327183025a8de6fd814a7c2929365497153ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47548
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-10391 lnet: router_discover - handle large addrs in ping 31/44631/7
Mr NeilBrown [Tue, 8 Nov 2022 15:37:12 +0000 (10:37 -0500)]
LU-10391 lnet: router_discover - handle large addrs in ping

lnet_router_discover_ping_reply() now considers the large
nids in the ping message.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia67bcf2b09c976d9e4bf49a409e0d7bffe778ba4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44631
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-10391 lnet: lnet_peer_merge_data to understand large addr 30/44630/7
Mr NeilBrown [Thu, 17 Nov 2022 15:05:25 +0000 (10:05 -0500)]
LU-10391 lnet: lnet_peer_merge_data to understand large addr

Large addr now understood.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib197e0183d202eb6a189b6668239744b2369f534
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44630
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-15619 osc: pack osc_async_page better 21/46721/9
Andreas Dilger [Fri, 4 Nov 2022 09:13:34 +0000 (03:13 -0600)]
LU-15619 osc: pack osc_async_page better

The oap_cmd field was used to store a number of other flags, but
those were redundant with oap_brw_page.flag, and never used.
That allows shrinking oap_cmd down to 2 bits.

Modern GCC allows specifying a bitfield for an enum, so the size
can be explicitly set.

The oap_page_off always holds < PAGE_SIZE, so it can safely fit
into PAGE_SHIFT bits, similar to ops_from. However, since this
field is used in math operations and we don't need the space,
always allocate it as an aligned 16-bit field.

This allows packing oap_async_flags, oap_cmd, and oap_page_off
into a 32-bit space.  This avoids having holes in the struct. The
explicit oap_padding fields are needed so that "packed" does not
cause the fields to be misaligned, but still allows packing with
the following 4-byte field in osc_page.

Also move oap_brw_page to the end of the struct, since the
bp_padding field therein is useless and can be removed. This
allows better packing with the bitfields in struct osc_page.

    brw_page       old size:  32, holes: 0, padding: 4
    brw_page       new size:  28, holes: 0, padding: 0
    osc_async_page old size: 104, holes: 8, padding: 4
    osc_async_page new size:  92, holes: 0, bit holes: 10
    osc_page       old size: 144, holes: 8, bit holes:  4
    osc_page       new size: 128, holes: 0, bit holes:  4

Together this saves 16 bytes *per page* in cache,
and fits osc_page into a nice-sized allocation.
That is 512MiB on a system with 128GiB of cache.
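
The 512MiB figure can be checked with a short calculation. This Python
sketch assumes 4KiB pages, which the commit message does not state
explicitly; cache_savings_bytes() is a helper named here for
illustration.

```python
GiB = 1 << 30
MiB = 1 << 20
PAGE_SIZE = 4096  # assumption: 4KiB pages


def cache_savings_bytes(cache_bytes, saved_per_page, page_size=PAGE_SIZE):
    """Bytes saved when every cached page carries a smaller osc_page."""
    return (cache_bytes // page_size) * saved_per_page


# osc_page shrinks from 144 to 128 bytes, i.e. 16 bytes per cached page.
savings = cache_savings_bytes(128 * GiB, 144 - 128)
print(savings // MiB, "MiB")
```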

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ief6aa7664d7299dba02332bc9029e4e9219d0876
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46721
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-16309 tests: skip test_150b() for EOPNOTSUPP 36/49136/4
Elena Gryaznova [Fri, 11 Nov 2022 09:53:47 +0000 (10:53 +0100)]
LU-16309 tests: skip test_150b() for EOPNOTSUPP

Fix sanity:test_150b() to really be skipped if
check_fallocate returns EOPNOTSUPP.

Fixes: 2f496148c3 ("LU-15551 ofd: Return EOPNOTSUPP instead of EPROTO")
Test-Parameters: trivial testlist="sanity" env=ONLY="150b"
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10961
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: I253b89bb3dd047434c7fa0e91bb0faefef24e128
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49136
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16294 kernel: kernel update SLES15 SP4 [5.14.21-150400.24.28.1] 46/49046/3
Jian Yu [Fri, 4 Nov 2022 07:25:26 +0000 (00:25 -0700)]
LU-16294 kernel: kernel update SLES15 SP4 [5.14.21-150400.24.28.1]

Update SLES15 SP4 kernel to 5.14.21-150400.24.28.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles15sp4 \
env=SANITY_EXCEPT="27J 101j 244a" testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I651894274a09b6240f321e787736d298c5dc41ce
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-15938 llog: more checks in llog_reader 12/48112/6
Mikhail Pershin [Tue, 2 Aug 2022 12:41:52 +0000 (15:41 +0300)]
LU-15938 llog: more checks in llog_reader

Add more correctness checks and reports in llog_reader:
- better report wrong record length and chunk skipping case
- add tail check: tail id and len should be the same as in head
- better report for gap in record indices
- test case with two corruption types:
  1) llog has bits set in bitmap beyond file end
  2) corruption in the middle

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I0c2af6ae2592c94e14e90ead12e28104409313b2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48112
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-15544 osd: Handle removal of EXT4_GET_BLOCKS_KEEP_SIZE 34/49134/4
Shaun Tancheff [Tue, 15 Nov 2022 03:29:50 +0000 (21:29 -0600)]
LU-15544 osd: Handle removal of EXT4_GET_BLOCKS_KEEP_SIZE

Linux commit v5.6-rc4-4-g4337ecd1fe99
  ext4: remove EXT4_EOFBLOCKS_FL and associated code

This change removes the define for EXT4_EOFBLOCKS_FL
as well as the enum value EXT_INODE_EOFBLOCKS.

Linux commit v5.7-rc4-4-g9e52484c7133
  ext4: remove EXT4_GET_BLOCKS_KEEP_SIZE flag

This change removes the define for EXT4_GET_BLOCKS_KEEP_SIZE

Fix the usage in osd_io.c to check for the existence of the
related definitions and remove the configure test.

Test-Parameters: trivial
Fixes: 791f656a03 ("LU-14776 ldiskfs: Add Ubuntu 20.04 HWE support")
HPE-bug-id: LUS-10744
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I883708ce2879f4d7d41ad3f7b75db2e751f6eb9b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49134
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-16216 tests: Update sanity-lnet for other LNDs 88/48788/4
Chris Horn [Mon, 3 Oct 2022 18:20:20 +0000 (12:20 -0600)]
LU-16216 tests: Update sanity-lnet for other LNDs

Modify various sanity-lnet test cases to allow them to execute on
other LNDs.

Some tests only work or make sense with socklnd, so we add explicit
checks for NETTYPE == tcp to these tests.

kfilnd doesn't currently support LNet drop rules, so any test cases
that utilize those are skipped for NETTYPE == kfi.

Two other fixes are included here:
 - test_230 doesn't check the correct default value of conns_per_peer
   in cases where the conns_per_peer parameter is set or when the link
   speed causes a value other than 1 to be used.
 - test_250 should be skipped in cases where the skip_mr_route_setup
   module parameter is > 0.

HPE-bug-id: LUS-10852
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I60f4c49d44d81b00bea01ff1f65adb6f20674bbf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48788
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-14736 utils: update leak-finder.pl for new format 83/43983/10
Emoly Liu [Wed, 11 May 2022 12:12:00 +0000 (20:12 +0800)]
LU-14736 utils: update leak-finder.pl for new format

Update leak-finder.pl to handle some of the newer log formats,
so that it produces more useful results. The changes include:
- add option "--by_func" to sort leak logs by function name in
  ascending order;
- add option "--summary" to print a summary report by the total
  number of leak bytes of each function in ascending order in
  YAML format,
  "- { func: <function_name>, alloc_bytes: <bytes>,
       leak_count: <count>, leak_bytes: <bytes> },"
- define LIBCFS_MEM_MSG() to print alloc/free log in a uniform
  format, as follows, also define LIBCFS_ALLOC_POST() and
  LIBCFS_FREE_PRE():
  mask:subs:cpu:epoch second.usec:?:pid:?:
  (filename:line:function_name()) alloc-type 'var_name': size at
  memory_address
- correct some alloc/free debug messages in lnet part.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Idee539f7aebbd49a52fe5254d292860c283ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43983
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-10391 lnet: change lnet_debug_peer() to struct lnet_nid 35/44635/10
Mr NeilBrown [Tue, 6 Jul 2021 05:19:31 +0000 (15:19 +1000)]
LU-10391 lnet: change lnet_debug_peer() to struct lnet_nid

lnet_debug_peer() now takes 'struct lnet_nid *'.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iba496dc2008228046eff0092cf25d98b9409e771
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44635
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
23 months agoLU-10391 lnet: discard lnet_nid2ni_*() 34/44634/10
Mr NeilBrown [Mon, 10 Oct 2022 22:53:22 +0000 (18:53 -0400)]
LU-10391 lnet: discard lnet_nid2ni_*()

These 'struct lnet_ni' lookup functions, which take a nid4, are
discarded in favour of the versions which take a 'struct lnet_nid'.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4f4cb154e7778ac5b68cec3a40fc0d9a8a1db480
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44634
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
23 months agoLU-10391 lnet: change lnet_notify() to take struct lnet_nid 33/44633/13
Mr NeilBrown [Tue, 8 Nov 2022 21:13:59 +0000 (16:13 -0500)]
LU-10391 lnet: change lnet_notify() to take struct lnet_nid

lnet_notify() now takes a 'struct lnet_nid *' instead of a
lnet_nid_t.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4c3ab0eea5202028ee881eee04bdd1014f7f150d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44633
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-10391 lnet: find correct primary for peer 32/44632/11
Mr NeilBrown [Tue, 6 Jul 2021 03:47:02 +0000 (13:47 +1000)]
LU-10391 lnet: find correct primary for peer

If the peer has a large-address for the primary, it can now be found.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie14ae970254bb58e26970bb09e7645979daa418a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44632
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
23 months agoLU-10391 lnet: extend lnet_is_nid_in_ping_info() 29/44629/16
Mr NeilBrown [Mon, 15 Aug 2022 15:57:57 +0000 (11:57 -0400)]
LU-10391 lnet: extend lnet_is_nid_in_ping_info()

lnet_is_nid_in_ping_info() now checks the ping_info for both
nid4 and larger nids.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7555947203acb5e5c6025ccb1ec5fba60bbf2f31
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44629
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
23 months agoLU-16307 util: fix lnetctl bugs that break sanity-sec 29/49129/5
James Simmons [Fri, 11 Nov 2022 02:40:04 +0000 (21:40 -0500)]
LU-16307 util: fix lnetctl bugs that break sanity-sec

For lnetctl net commands you always need a --net option. For the
case of creating a local NI you need a list of interfaces.
For yaml_lnet_config_ni() I was always requiring a list of
interfaces which is wrong. Only test for missing interfaces
for the NLM_F_CREATE case.
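
The corrected validation rule can be sketched as below. This is a
Python model rather than the lnetctl source: check_ni_request() and
the choice of -EINVAL are assumptions; NLM_F_CREATE uses the value
defined in <linux/netlink.h>.

```python
import errno

NLM_F_CREATE = 0x400  # value from <linux/netlink.h>


def check_ni_request(net, interfaces, nlm_flags):
    """Return 0 if the request is acceptable, else a negative errno."""
    if not net:
        # A --net option is always required.
        return -errno.EINVAL
    if (nlm_flags & NLM_F_CREATE) and not interfaces:
        # Only creating a local NI requires a list of interfaces.
        return -errno.EINVAL
    return 0
```

A request without interfaces is therefore rejected only when it
carries NLM_F_CREATE.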

The second bug is that if we fail to detect Netlink, we fall back to
the old API, but we end up freeing some needed parameters instead of
passing them to the old API. Jump to the old API code without freeing
the parameter data.

Fixes: 8f8f6e2f3 ("LU-10003 lnet: use Netlink to support old and new NI APIs.")
Change-Id: I4b11372b9afa8023bcbaba3297dd04196d22ef05
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16293 kernel: kernel update RHEL9.0 [5.14.0-70.30.1.el9_0] 44/49044/3
Jian Yu [Fri, 4 Nov 2022 07:10:20 +0000 (00:10 -0700)]
LU-16293 kernel: kernel update RHEL9.0 [5.14.0-70.30.1.el9_0]

Update RHEL9.0 kernel to 5.14.0-70.30.1.el9_0 for Lustre client.

Test-Parameters: trivial clientdistro=el9.0 \
env=SANITY_EXCEPT="130 244a" testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Ide942f88242c80af1e103b226b65cfbce94bfb57
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
23 months agoLU-16290 lnet: Signal completion on ping send failure 20/49020/3
Chris Horn [Tue, 1 Nov 2022 20:33:18 +0000 (14:33 -0600)]
LU-16290 lnet: Signal completion on ping send failure

Call complete() on the ping_data::completion if we get
LNET_EVENT_SEND with non-zero status. Otherwise the thread which
issued the ping is stuck waiting for the full ping timeout.

A pd_unlinked member is added to struct ping_data to indicate whether
the associated MD has been unlinked. This is checked by lnet_ping() to
determine whether it needs to explicitly call LNetMDUnlink().

Lastly, in cases where we do not receive a reply, we now return the
value of pd.rc, if it is non-zero, rather than -EIO. This can provide
more information about the underlying ping failure.
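
The return-value and unlink rules described above can be summarised
in a short Python sketch. ping_result() and needs_explicit_unlink()
are hypothetical helpers, not functions from lnet_ping(); only pd.rc,
pd_unlinked, and -EIO come from the commit text.

```python
import errno


def ping_result(got_reply, pd_rc):
    """Status returned to the caller of the ping."""
    if got_reply:
        return 0
    # Without a reply, prefer the recorded failure over a generic -EIO.
    return pd_rc if pd_rc != 0 else -errno.EIO


def needs_explicit_unlink(pd_unlinked):
    """The MD only has to be unlinked by hand if that has not happened."""
    return not pd_unlinked
```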

HPE-bug-id: LUS-11317
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I1bc573cf7397e319993fa8aabb31c5f3b59768e7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16046 ldlm: group lock unlock fix 08/49008/3
Vitaly Fertman [Thu, 27 Oct 2022 19:54:18 +0000 (22:54 +0300)]
LU-16046 ldlm: group lock unlock fix

The original LU-9964 fix had a problem: with many pages in memory,
grouplock unlock takes 10+ seconds just to discard them.

The current patch makes the grouplock unlock non-atomic, but makes a
new grouplock enqueue wait until the previous CBPENDING lock is
destroyed.

HPE-bug-id: LUS-10644
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I7798138b953320c477ce60c4e34eac40ada95a69
Reviewed-on: https://es-gerrit.dev.cray.com/161411
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49008
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16282 lnet: fix debug message in lnet_discovery_event_reply 97/48997/2
Serguei Smirnov [Mon, 31 Oct 2022 22:46:04 +0000 (15:46 -0700)]
LU-16282 lnet: fix debug message in lnet_discovery_event_reply

The message in lnet_discovery_event_reply currently says
"Peer X has discovery disabled" even though the same path
may be taken if discovery is disabled locally.
Change the debug message to indicate whether discovery is
disabled on the peer side or locally.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I9c8be2286693c2bfc3f8cf67b6f3b8ab26f8258b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48997
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16281 clio: append to non-existent component 94/48994/2
Vitaly Fertman [Tue, 5 Jul 2022 21:00:58 +0000 (00:00 +0300)]
LU-16281 clio: append to non-existent component

Appending to a non-existent component should return an error, but it
currently fails with the BUG below because the return value @rc of
lov_io_layout_at() is not checked for < 0.

BUG: unable to handle kernel paging request at ffff99d3c2f74030
    Call Trace:
      lov_stripe_number+0x19/0x40 [lov]
      lov_page_init_composite+0x103/0x5f0 [lov]
      ? kmem_cache_alloc+0x12e/0x270
      cl_page_alloc+0x19f/0x660 [obdclass]
      cl_page_find+0x1a0/0x250 [obdclass]
      ll_write_begin+0x1f7/0xfb0 [lustre]

HPE-bug-id: LUS-11075
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I4371f56cd9cdb3429d52a283831fb0a768e5c9c3
Reviewed-on: https://es-gerrit.dev.cray.com/161123
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48994
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months agoLU-16269 kernel: kernel update RHEL8.6 [4.18.0-372.32.1.el8_6] 69/48969/4
Jian Yu [Thu, 3 Nov 2022 19:25:07 +0000 (12:25 -0700)]
LU-16269 kernel: kernel update RHEL8.6 [4.18.0-372.32.1.el8_6]

Update RHEL8.6 kernel to 4.18.0-372.32.1.el8_6.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.6 serverdistro=el8.6 testlist=sanity

Change-Id: I5576180ddf10ed2b0a5e2ef85b58fef993de65a4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48969
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
23 months agoLU-16167 obdclass: fix lctl llog_print with skipped records 86/48586/8
Etienne AUJAMES [Mon, 19 Sep 2022 10:23:47 +0000 (12:23 +0200)]
LU-16167 obdclass: fix lctl llog_print with skipped records

Commit 2a5b50d ignores the skipped records in the configuration llog.
But if the OBD_IOC_LLOG_PRINT ioctl returns 0 records to display,
jt_llog_print_iter() will stop processing and ignore the
non-skipped records at the end of the llog.

This patch reports to user space whether the last index processed
(by llog_print_cb) is the last index of the llog file. If so,
jt_llog_print_iter() stops processing.
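
The iteration rule can be modelled as follows. This Python sketch is
not the lctl code: print_batch() is a hypothetical stand-in for one
OBD_IOC_LLOG_PRINT call, and records are simplified to
(index, skipped) pairs.

```python
def print_batch(records, start, count):
    """Model one batched print call.

    Returns (visible_indices, reached_last), where reached_last says
    whether this batch covered the last index of the log.
    """
    window = [r for r in records if start <= r[0] < start + count]
    visible = [idx for idx, skipped in window if not skipped]
    last_index = records[-1][0] if records else 0
    return visible, start + count > last_index


def print_all(records, batch=2):
    """Iterate over the whole log in batches."""
    out, start = [], 1
    while True:
        visible, reached_last = print_batch(records, start, batch)
        out.extend(visible)
        if reached_last:
            # Stop at end of log, not merely because a batch was empty.
            return out
        start += batch
```

A batch consisting only of skipped records yields nothing to display,
yet the loop continues and still reaches the records after it.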

Add regression test "conf-sanity test_123ai" for this issue.

Fixes: 2a5b50d ("LU-15142 lctl: fixes for set_param -P and llog_print")
Test-Parameters: testlist=conf-sanity env=ONLY=123ai,SLOW=yes,ONLY_REPEAT=10
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I78395268c57555e4fd2a4048ccf5b6132ca2877f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months ago LU-15388 osd-ldiskfs: wrong dot/dotdot FID for local agent 04/45904/8
Li Xi [Tue, 21 Dec 2021 10:24:30 +0000 (18:24 +0800)]
LU-15388 osd-ldiskfs: wrong dot/dotdot FID for local agent

Wrong FIDs are passed into osd_add_dot_dotdot_internal() in
osd_create_local_agent_inode(). The local agent inode is created only
to satisfy e2fsck, and these two FIDs are not used anywhere, so the
wrong values cause no known issue.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ife39d539921a37994f9c6046ae066e1927154136
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45904
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months ago LU-15058 libcfs: introduce genradix support 90/45890/18
James Simmons [Wed, 2 Nov 2022 16:50:34 +0000 (12:50 -0400)]
LU-15058 libcfs: introduce genradix support

It has long been known that vmalloc allocations carry a performance
penalty. So another solution was introduced that uses a generic radix
tree to manage page-size allocations. This new functionality has the
potential to give us large performance gains. The new API also has an
advantage over vmalloc when we do not know ahead of time how much to
allocate. This is the case for a Netlink data stream that is sending
an array of data. We can use it to grow an array as needed instead of
guessing how much data is coming.
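
The grow-on-demand idea can be illustrated with a small user-space
sketch. This is not the kernel genradix implementation, which uses a
radix tree of pages: the flat chunk table and every name below
(struct toy_chunked, toy_ptr_alloc(), toy_ptr(), toy_free(),
TOY_CHUNK_BYTES) are assumptions made for illustration. It only shows
that elements live in fixed page-sized chunks allocated when first
touched, so no large contiguous buffer has to be sized up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_CHUNK_BYTES 4096	/* assumed page size for this sketch */
#define TOY_PER_CHUNK (TOY_CHUNK_BYTES / sizeof(int))

/* Hypothetical growable int array backed by page-sized chunks. */
struct toy_chunked {
	int **chunks;		/* chunk table; entries are NULL until touched */
	size_t nchunks;
};

/* Return a pointer to slot idx, allocating its chunk on demand.
 * Returns NULL if memory cannot be allocated. */
static int *toy_ptr_alloc(struct toy_chunked *a, size_t idx)
{
	size_t c = idx / TOY_PER_CHUNK;

	assert(a != NULL);
	if (c >= a->nchunks) {
		/* Only the small table is reallocated; chunks never move. */
		int **t = realloc(a->chunks, (c + 1) * sizeof(*t));

		if (!t)
			return NULL;
		for (size_t i = a->nchunks; i <= c; i++)
			t[i] = NULL;
		a->chunks = t;
		a->nchunks = c + 1;
	}
	if (!a->chunks[c]) {
		a->chunks[c] = calloc(1, TOY_CHUNK_BYTES);
		if (!a->chunks[c])
			return NULL;
	}
	return &a->chunks[c][idx % TOY_PER_CHUNK];
}

/* Look up slot idx without allocating; NULL if it was never touched. */
static int *toy_ptr(const struct toy_chunked *a, size_t idx)
{
	size_t c = idx / TOY_PER_CHUNK;

	if (c >= a->nchunks || !a->chunks[c])
		return NULL;
	return &a->chunks[c][idx % TOY_PER_CHUNK];
}

/* Release every chunk and the table itself. */
static void toy_free(struct toy_chunked *a)
{
	for (size_t i = 0; i < a->nchunks; i++)
		free(a->chunks[i]);
	free(a->chunks);
	a->chunks = NULL;
	a->nchunks = 0;
}
```

A receiver of a stream with an unknown number of entries would call
toy_ptr_alloc() for each new index as data arrives. Because existing
chunks are never moved, pointers to earlier elements stay valid while
the array grows.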

Test-Parameters: trivial
Change-Id: I2d7b78569defd651fbb75bfbd82e0d805aff436e
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45890
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months ago LU-8915 lnet: migrate LNet selftest session handling to Netlink 98/43298/24
James Simmons [Mon, 17 Oct 2022 14:37:46 +0000 (10:37 -0400)]
LU-8915 lnet: migrate LNet selftest session handling to Netlink

The current LNet selftest ioctl interface has a few issues which
can be resolved using Netlink. The first is that the current API uses
struct list_head, which is disliked by the Linux VFS maintainers.
While we technically don't need to use struct list_head directly,
it's still confusing, and passing pointers from userland to kernel
space is also frowned on.

The second issue, exposed with debug kernels, is that ioctl
handling done with lstcon_ioctl_handler can easily end up
in a might_sleep state.

The new Netlink work is also needed for the IPv6 support. Update
the session handling to work with large NIDs. Internally use
struct lst_session_id which supports large NIDs instead of
struct lst_sid.

Lastly, we have wanted YAML handling with LNet selftest
(LU-10975), which comes naturally with this work.

Test-Parameters: trivial testlist=lnet-selftest
Change-Id: I6ac56e51c6bfcf651a28388129020249166a7034
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43298
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>