Whamcloud - gitweb
fs/lustre-release.git
4 months ago LU-17892 lnet: Fix export with empty nets 79/55279/3
Chris Horn [Fri, 24 May 2024 17:32:17 +0000 (11:32 -0600)]
LU-17892 lnet: Fix export with empty nets

lnetctl export --backup should not print an error when there haven't
been any NIs added.

Test-Parameters: trivial
Fixes: 8f8f6e2f36 ("LU-10003 lnet: use Netlink to support old and new NI APIs.")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Id9e916d2d70d5dc01442e24449cb787c5a6a7e1d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55279
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 months ago LU-17416 utils: option for lctl get_param to skip links 36/55236/12
Frederick Dilger [Mon, 27 May 2024 21:38:51 +0000 (17:38 -0400)]
LU-17416 utils: option for lctl get_param to skip links

Added new --links and --no-links options for 'lctl get_param' and
'lctl list_param' to avoid following symlinks. Useful when combined
with a command like "lctl get_param -R '*'" which can dump a lot of
duplicate data due to symlinks under lov.*.target_obds and
lmv.*.target_obds pointing back to their respective osc.* and mdc.*
trees. By default --links is enabled so that lctl get_param
continues to operate as it did before this patch.

Additionally, long options have been added for all previous options
in {list, get, set}_param to update command options to current
standards. This will also facilitate adding new options in the future
as well as code maintenance and readability.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I24115835f5045623f78fa2045dc3e0ce0b795316
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55236
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-12706 tests: sanity-quota 4a sync timeout fix 16/55216/3
Sergey Cheremencev [Mon, 27 May 2024 22:49:24 +0000 (01:49 +0300)]
LU-12706 tests: sanity-quota 4a sync timeout fix

Don't sync all OSTs in a system - this might take
too much time. Instead, set striping only on OST0000
and sync only MDTs and OST0000. This fix is against
the following failure:

  FAIL: Passed grace time 20, 15669105271566910563

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I525e6c73c6d14a126a2bde7d92bc28f11f3c78c8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17870 lu: delete lu_ref forever 82/55182/5
Timothy Day [Thu, 23 May 2024 03:28:51 +0000 (03:28 +0000)]
LU-17870 lu: delete lu_ref forever

Remove lu_ref infrastructure forever. This debugging infrastructure
is often broken and doesn't correspond with the actual reference
counting used to manage object lifetimes. Hence, when a real bug
is encountered (i.e. some thread isn't releasing a reference),
this code (assuming it happens to be working) can't actually help
debug the issue.

Recently, I was debugging an issue with ld_ref counting. Naturally,
I turned to the debugging code available already in Lustre. I was
dismayed to find that it was more broken than the code I was already
debugging. Rather than debug the debugging code, I think it's
better to cast it away.

Most compelling, the builds used by Maloo and Gerrit Janitor don't
enable this feature. So it can be broken for long periods of time
without anyone noticing.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I8eaa6d8518f642adebb612ec3fa780b584366f4f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55182
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17872 ldlm: switch to read_positive in reclaim_full 41/55141/7
Patrick Farrell [Wed, 29 May 2024 20:41:54 +0000 (16:41 -0400)]
LU-17872 ldlm: switch to read_positive in reclaim_full

Checking reclaim full for every lock request is expensive;
it requires taking a global spinlock and can completely
clog the MDS CPU on larger systems.

If we switch to read_positive rather than sum_positive for
our counter read, we avoid this spinlock at the cost of
being off by as much as NR_CPU*32.

Since the counter is for hundreds of thousands to millions
of items and just triggers memory reclaim, this level of
error is completely fine.
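
The trade-off described above can be illustrated with a small user-space
model. This is only a sketch, not the kernel's percpu_counter code or the
ldlm change itself: the names approx_counter, counter_add,
counter_read_positive and counter_sum_positive, and the values NCPU and
BATCH, are hypothetical, and no locking is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU  4   /* hypothetical CPU count for this model */
#define BATCH 32  /* per-CPU slack before folding into the shared total */

struct approx_counter {
	int64_t total;        /* shared value; the kernel guards it with a lock */
	int32_t local[NCPU];  /* per-CPU deltas not yet folded into total */
};

/* Add to one CPU's delta; fold into the shared total once it reaches BATCH. */
static void counter_add(struct approx_counter *c, int cpu, int32_t amount)
{
	assert(cpu >= 0 && cpu < NCPU);
	int32_t v = c->local[cpu] + amount;

	if (v >= BATCH || v <= -BATCH) {
		c->total += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: looks at the shared total only, clamped at zero. */
static int64_t counter_read_positive(const struct approx_counter *c)
{
	return c->total > 0 ? c->total : 0;
}

/* Exact read: walks every CPU's delta, which is the expensive path. */
static int64_t counter_sum_positive(const struct approx_counter *c)
{
	int64_t sum = c->total;

	for (int i = 0; i < NCPU; i++)
		sum += c->local[i];
	return sum > 0 ? sum : 0;
}
```

In this model, if each of the four CPUs holds 31 unfolded increments, the
cheap read reports 0 while the exact sum reports 124, which stays within
the NCPU * BATCH bound.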

This resolves the contention issue. On an OCI system with
384 cores, here's our mdtest comparison:

Operation           | Without Patch | With Patch  | %Change
--------------------|---------------|-------------|-------
Directory creation  | 69481.994     | 64373.060   | -7%
Directory stat      | 87942.757     | 274670.454  | 212%
Directory rename    | 78127.922     | 92592.239   | 19%
Directory removal   | 69901.490     | 89560.415   | 28%
File creation       | 62789.774     | 107294.450  | 71%
File stat           | 88039.061     | 480469.711  | 446%
File read           | 82192.370     | 151117.380  | 84%
File removal        | 146690.828    | 127589.655  | -13%
Tree creation       | 46.549        | 56.992      | 22%
Tree removal        | 51.531        | 53.967      | 5%

Note the *446%* improvement in stat and the 70-80% in
file creation and read.

Note this issue is likely much worse on systems with higher
core counts since the cost of summing the counter scales
with the number of CPUs.  This may be why this has not been
seen before.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I01a39abf5e6f0829156b413b1f44001e2c504be2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55141
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: wangdi <di.d.wang@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17844 ptlrpc: remove all LCONSOLE_ERROR_MSG() 36/55136/3
Timothy Day [Fri, 17 May 2024 00:30:57 +0000 (00:30 +0000)]
LU-17844 ptlrpc: remove all LCONSOLE_ERROR_MSG()

These magic numbers aren't so magical anymore. Just
use LCONSOLE_ERROR().

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ica8ebd06b7e8ea8c7eb00181ab3c0b06de2481ca
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55136
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17844 obdclass: remove all LCONSOLE_ERROR_MSG() 34/55134/3
Timothy Day [Fri, 17 May 2024 00:15:25 +0000 (00:15 +0000)]
LU-17844 obdclass: remove all LCONSOLE_ERROR_MSG()

These magic numbers aren't so magical anymore. Just
use LCONSOLE_ERROR().

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib5310f65cda7a3537837e8a38801e6d1771d4759
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55134
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-10026 csdc: reserve OBD_BRW_SPECULATIVE_COMPR flag 04/55104/5
Artem Blagodarenko [Tue, 14 May 2024 12:25:19 +0000 (08:25 -0400)]
LU-10026 csdc: reserve OBD_BRW_SPECULATIVE_COMPR flag

DIO does not set KMS like buffered IO, and the KMS it sets
is not safe.  So this requires special handling for last
chunk compression.

Since we can't know when we're doing the last chunk with
DIO, the solution is as follows:
If a DIO write is chunk aligned at the start but not a full
chunk, we compress it but mark it 'speculative'.  Then the
server double checks that the write is beyond current file
size, and if it's not, it will ask the client to do a
resend, and the client will send the data back
uncompressed.
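
The two checks described above can be sketched as a pair of predicates.
This is a hedged illustration only: the function names are hypothetical,
and modelling "beyond current file size" as offset >= file_size is an
assumption, not something the commit message spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Client side (hypothetical helper): a DIO write is compressed and marked
 * speculative when it starts on a chunk boundary but is shorter than a
 * full chunk.
 */
static bool dio_write_is_speculative(uint64_t offset, uint64_t len,
				     uint64_t chunk_size)
{
	assert(chunk_size > 0);
	return len > 0 && offset % chunk_size == 0 && len < chunk_size;
}

/*
 * Server side (hypothetical helper): accept the speculative write only if
 * it lies beyond the current file size; otherwise the client is asked to
 * resend the data uncompressed.
 */
static bool server_accepts_speculative(uint64_t offset, uint64_t file_size)
{
	return offset >= file_size;
}
```

With a 64 KiB chunk, a 4 KiB write at offset 128 KiB would be marked
speculative, and the server would accept it only if the file currently
ends at or before 128 KiB.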

This makes it reasonable to fully enable DIO to compressed
files - previously we converted unaligned DIO to buffered
IO.

This patch reserves OBD_BRW_SPECULATIVE_COMPR flag.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I679bc103bd2862115d94286e7c2ed43e1580b29e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55104
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
4 months ago LU-17832 gni: build should not collapse extra symbols 52/55052/2
Shaun Tancheff [Wed, 8 May 2024 15:38:43 +0000 (22:38 +0700)]
LU-17832 gni: build should not collapse extra symbols

cray-obs spec files (ari,gem,dmp) define:
   KBUILD_EXTRA_SYMBOLS and GNICPPFLAGS

When building kgnilnd the environment variable needs to be passed
through to make.

HPE-bug-id: LUS-12269
Test-Parameters: trivial
Fixes: 8b1d2a72f1 ("LU-16967 build: Add in-kernel-ko2iblnd driver")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Icc7ac33138300bf3836082a014daf580a1632436
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55052
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17708 lnet: update kfi and o2ib to handle NULL lnet_msg 77/54677/7
Shaun Tancheff [Sun, 9 Jun 2024 23:59:43 +0000 (06:59 +0700)]
LU-17708 lnet: update kfi and o2ib to handle NULL lnet_msg

Handle the NULL lnet_msg cases in the lnd_recv() handlers
of kfi and o2ib lnds.

HPE-bug-id: LUS-12245
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia0a8957653353380ef77c9686a020284db0e460c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54677
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17573 lov: change default object size. 37/54137/3
Alexey Lyashkov [Thu, 22 Feb 2024 06:38:03 +0000 (09:38 +0300)]
LU-17573 lov: change default object size.

OSTs have not been able to use indirect blocks for a long time,
so switch the default object size to extent-based.

Test-Parameters: trivial
HPE-bug-id: LUS-11428
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I9759fc7122c41075ebc35d52ade342c37706b041
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54137
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17460 lnet: support IPv6 for link state 61/53761/11
James Simmons [Tue, 23 Apr 2024 13:30:04 +0000 (09:30 -0400)]
LU-17460 lnet: support IPv6 for link state

The LNet layer monitors the state of the underlying TCP
connection. Currently it only supports network interfaces
setup with IPv4 addresses. Update to handle IPv6 setups.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I249e9591d5f637112f6bd862cd0f928a555af229
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53761
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 months ago LU-15553 test: mkdir_on_mdt0 in sanity-krb5 53/51653/8
Lai Siyao [Sat, 8 Jul 2023 22:32:29 +0000 (18:32 -0400)]
LU-15553 test: mkdir_on_mdt0 in sanity-krb5

test_8 requires the test dir to be created on MDT0, so replace mkdir
with mkdir_on_mdt0. This was found by the script:
grep -C 10 -n "do_facet.*SINGLEMDS" lustre/tests/*.sh | grep -w mkdir

Fixes: b9c4dc3c33 ("LU-14792 llite: enable filesystem-wide default LMV")
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity-krb5,sanity-krb5,sanity-krb5
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I09b1aec95bff84622accea91650887dffc1245f3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51653
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 months ago LU-11085 lustre: remove interval-tree code 66/49166/11
Mr NeilBrown [Wed, 5 Jun 2024 19:59:31 +0000 (15:59 -0400)]
LU-11085 lustre: remove interval-tree code

The lustre interval tree is no longer used.  All users have been
changed to use the Linux interval tree implementation.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7aaa79ebb5e672657dd96c79bd8f85cdf3ce5438
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49166
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17276 ldlm: use interval tree for searching in flock 51/53951/14
Mr NeilBrown [Fri, 26 Apr 2024 14:40:20 +0000 (10:40 -0400)]
LU-17276 ldlm: use interval tree for searching in flock

This patch converts ldlm_process_flock_lock() to use the new interval
tree to find flock locks more efficiently.

Previously all locks with the same owner were adjacent in the
lr_granted list.  This was used for the second stage of merging
overlapping locks once it was confirmed that there were no conflicts.
Now instead we build up a temporary list of locks in the target range
that have the same owner, and use that for the second stage.
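
The collection step described above can be sketched as follows. This is
an assumption-laden illustration, not the ldlm code: struct flock_range
and both functions are hypothetical, and a plain array scan stands in for
the interval tree query so that the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a granted flock lock; the range is inclusive. */
struct flock_range {
	uint64_t owner;
	uint64_t start;
	uint64_t end;
};

/* Two inclusive ranges overlap when each one starts before the other ends. */
static int range_overlaps(const struct flock_range *lock,
			  uint64_t start, uint64_t end)
{
	return lock->start <= end && start <= lock->end;
}

/*
 * Build the temporary list: every granted lock in [start, end] that has
 * the given owner. Returns the number of entries stored in out[].
 */
static size_t collect_same_owner(const struct flock_range *granted, size_t n,
				 uint64_t owner, uint64_t start, uint64_t end,
				 const struct flock_range **out, size_t out_cap)
{
	size_t found = 0;

	assert(start <= end);
	for (size_t i = 0; i < n && found < out_cap; i++) {
		if (granted[i].owner == owner &&
		    range_overlaps(&granted[i], start, end))
			out[found++] = &granted[i];
	}
	return found;
}
```

The resulting list plays the role that adjacency in lr_granted used to
play: the second (merge) stage only needs to walk these entries.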

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I0a4f1e833d8db36827c318a020de564a78b0adb5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53951
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17276 ldlm: convert flock locks to linux interval tree. 50/53950/17
Mr NeilBrown [Wed, 7 Feb 2024 05:21:48 +0000 (16:21 +1100)]
LU-17276 ldlm: convert flock locks to linux interval tree.

Convert to using the linux interval tree code.  When the range of a
lock is changed as part of adding or removing an overlapping range,
the lock is removed and readded to the tree.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I747b625af1e83210b12daac5102600a3de173a2a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53950
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
4 months ago LU-6142 utils: Fix style issues for lst.c 89/55289/6
Arshad Hussain [Mon, 3 Jun 2024 05:59:53 +0000 (01:59 -0400)]
LU-6142 utils: Fix style issues for lst.c

This patch fixes issues reported by checkpatch
for file lnet/utils/lst.c

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9a10254fe3da725fcc88f656f944cfc2597ed8cc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-6142 o2ib: SPDX for Infiniband driver 87/55287/2
Timothy Day [Sun, 2 Jun 2024 23:15:53 +0000 (23:15 +0000)]
LU-6142 o2ib: SPDX for Infiniband driver

Convert from verbose license text to SPDX.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I3205f7e0e2e4bbb8609320e32f0975f82882f5dc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55287
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-6142 mgs: SPDX for management server 86/55286/3
Timothy Day [Sun, 2 Jun 2024 23:02:07 +0000 (23:02 +0000)]
LU-6142 mgs: SPDX for management server

Convert from verbose license text to SPDX.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I41c91276789bbadf9967ee18033620431d7561a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55286
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17070 lov: retry layout refresh if got old layouts 61/55061/6
Bobi Jam [Thu, 9 May 2024 09:23:37 +0000 (17:23 +0800)]
LU-17070 lov: retry layout refresh if got old layouts

lov_layout_change() does not apply old layouts, which can get through
when the MDS doesn't take the layout lock. This patch retries fetching
the layout once and re-applies it.

Fixes: 13557aa869 ("LU-15300 mdt: refresh LOVEA with LL granted")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id29ec4ada85060a20f730f92a6a9409d755a56a1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55061
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 months ago LU-17809 osp: make disconnect asynchronous 95/54995/7
Alexander Boyko [Sat, 20 Apr 2024 22:02:54 +0000 (18:02 -0400)]
LU-17809 osp: make disconnect asynchronous

An MDT can have many OSP devices. During umount, the disconnect
requests can hit cascading timeouts, which can lead to an
unpredictably long umount time.

This patch adds the ability to disconnect OSP devices in parallel.
During LCFG_PRECLEANUP, osp_disconnect() sends the disconnect requests,
and osp_shutdown() waits for them. The cascading timeouts are thereby
reduced to a single request wait.
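
The effect on umount time can be sketched with simple arithmetic. This is
a simplified model under stated assumptions, not the OSP code: both
function names are hypothetical, and each array entry is the time one
disconnect request takes to complete or time out.

```c
#include <assert.h>
#include <stddef.h>

/* Serial disconnects: every request is waited for in turn, so delays add up. */
static unsigned int sequential_wait(const unsigned int *delays, size_t n)
{
	unsigned int total = 0;

	assert(delays != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += delays[i];
	return total;
}

/* Parallel disconnects: all requests are in flight, so only the slowest counts. */
static unsigned int parallel_wait(const unsigned int *delays, size_t n)
{
	unsigned int longest = 0;

	assert(delays != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (delays[i] > longest)
			longest = delays[i];
	return longest;
}
```

For four OSP devices that each take 5 seconds to time out, the serial
model waits 20 seconds while the parallel model waits 5.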

Don't drop obd_force flag from upper layers.

Adds replay-single test 201, which simulates delays of OSP disconnects.
This leads to a high cumulative umount time.

HPE-bug-id: LUS-12251
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id788b22c494147bdc7f0d36968629e7b7f660e01
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54995
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
4 months ago LU-9680 lnet: Convert lnetctl debug recovery to netlink 34/53734/12
Chris Horn [Wed, 5 Jun 2024 13:07:40 +0000 (09:07 -0400)]
LU-9680 lnet: Convert lnetctl debug recovery to netlink

Convert the lnetctl debug recovery command to netlink

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic44cd93708b2e753e99901ba10334be17250a23c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53734
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
4 months ago LU-12998 lod: statfs upon nocreate check 37/53437/3
Lai Siyao [Tue, 12 Dec 2023 19:50:33 +0000 (14:50 -0500)]
LU-12998 lod: statfs upon nocreate check

lod_declare_create() checks whether the target MDT of a directory
create is the current MDT. A mismatch may happen if nocreate is set on
some MDT. Upon such a mismatch, call dt_statfs() to fetch the latest
statfs to know whether nocreate is set.

lmv_create() will choose another MDT if the target MDT has nocreate
set, but in case the flag has been cleared, call obd_statfs() to fetch
the cached statfs and check again.

Fixes: 1dbcd0bab88 ("LU-12998 mds: add no_create parameter to stop creates")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I2575d15416968554c66d40dcf18ecca2a06c7a37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-17402 kernel: update dotdot patch path for RHEL 8.10 81/55381/2
Jian Yu [Mon, 10 Jun 2024 18:06:53 +0000 (11:06 -0700)]
LU-17402 kernel: update dotdot patch path for RHEL 8.10

After commit 0536b2a landed, the patch path for
ext4-hash-indexed-dir-dotdot-update.patch was
changed from ubuntu18/ to rhel8.7/.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-3

Change-Id: I323fe06cfd125ad57959782bb33a2af81b705788
Fixes: 0536b2a ("LU-17711 osd-ldiskfs: do not delete dotdot during rename")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55381
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17887 obd: do not update obd_memory from RCU 63/55263/2
Bruno Faccini [Thu, 30 May 2024 16:39:37 +0000 (18:39 +0200)]
LU-17887 obd: do not update obd_memory from RCU

OBD_FREE_PRE() should not be run from an RCU
callback as the obdclass module may have been
unloaded during the RCU grace period.

Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I6f663b2aed2e60c15f2a1b9755b2c4050bd91ce2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55263
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months ago LU-17000 utils: Initialize var 'gw' and 'net' before using 16/55316/2
Arshad Hussain [Wed, 5 Jun 2024 06:46:13 +0000 (02:46 -0400)]
LU-17000 utils: Initialize var 'gw' and 'net' before using

Although this is called at "sequence end", when 'gw' and 'net'
will most likely have been populated, it is still good to be
defensive and initialize them.

Test-Parameters: trivial testlist=sanity-lnet
CoverityID: 410246 ("Uninitialized scalar variable")
CoverityID: 410240 ("Uninitialized scalar variable")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I2f47df431eea0e0344043ac22806865e87435c6e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55316
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17899 gss: lsvcgss service fix 93/55293/2
Sebastien Buisson [Mon, 3 Jun 2024 11:52:20 +0000 (13:52 +0200)]
LU-17899 gss: lsvcgss service fix

The lsvcgss service can fail to start if the daemon is invoked with
the '-k' option whereas no proper Kerberos configuration is in place
on the server. The daemon should ignore the '-k' option in such a case
and try to start the other provided modes if any (SSK, Null).
And in case the daemon is started with the '-s' option (SSK), it
spawns a temporary additional thread to compute the number of rounds
used for Miller-Rabin prime testing. So the lsvcgss_sysd script should
support that.

Fixes: c6878334a1 ("LU-17741 gss: fix lsvcgss service for systemd")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iba632bd0ea9696ccea52bff5982a4d4e490597a7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55293
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17000 obdclass: Initialize var 'bufsize' before using 91/55291/2
Arshad Hussain [Mon, 3 Jun 2024 10:01:04 +0000 (06:01 -0400)]
LU-17000 obdclass: Initialize var 'bufsize' before using

This patch initializes the variable bufsize before it is used, because
bufsize is left uninitialized if the obd_page_dif_generate_buffer() call
fails. Once bufsize is initialized, calling cfs_crypto_hash_final()
becomes safe.

Test-Parameters: trivial
CoverityID: 397224 ("Uninitialized scalar variable")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I933cc3746d107acb308bd0060b7648a82410711c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55291
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17844 oss: remove all LCONSOLE_ERROR_MSG() 81/55281/2
Timothy Day [Sat, 1 Jun 2024 04:32:01 +0000 (04:32 +0000)]
LU-17844 oss: remove all LCONSOLE_ERROR_MSG()

These magic numbers aren't so magical anymore. Just
use LCONSOLE_ERROR().

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Id7ae3b50478c434203adfb375cb31f158d4b29d4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55281
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-16946 utils: allow lfs find time within a range 71/55271/6
Frederick Dilger [Thu, 30 May 2024 22:00:57 +0000 (18:00 -0400)]
LU-16946 utils: allow lfs find time within a range

If multiple times are specified on the command-line like:

        lfs find -atime +60 -atime -90 ...

use those times as the upper and lower bounds of the margin.
This makes it easier to find files that were created within
a specific range of dates.

While working on this patch I noticed that the margin
bounds are a little odd, using the range:

(limit - margin, limit]

instead of what I intuitively thought,

(limit - margin, limit + margin)

The logic behind this is unknown to me, but it can be found in
'liblustreapi.c' in the function 'find_value_cmp()'.
However, for the time being, when using a time range, it will
simply shift the limit to the largest of the two and have
the margin cover the difference.
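
The mapping from two bounds onto a single limit and margin can be
sketched as follows. This is a minimal model on an abstract time axis,
assuming the (limit - margin, limit] interval quoted above; the struct
and function names are hypothetical and this is not the find_value_cmp()
code.

```c
#include <assert.h>

/* Hypothetical window matching the half-open range (limit - margin, limit]. */
struct time_window {
	long long limit;
	long long margin;
};

/* Shift the limit to the larger bound; the margin covers the difference. */
static struct time_window window_from_bounds(long long a, long long b)
{
	struct time_window w;

	w.limit = a > b ? a : b;
	w.margin = a > b ? a - b : b - a;
	return w;
}

/* The lower end is exclusive and the upper end is inclusive. */
static int in_window(struct time_window w, long long t)
{
	assert(w.margin >= 0);
	return t > w.limit - w.margin && t <= w.limit;
}
```

With bounds 60 and 90, the window becomes limit 90 and margin 30, so 75
and 90 match while 60 and 91 do not.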

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I2e5b856396472eab91e1d2c3214f304010601a41
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55271
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-7892 utils: removed deprecated create_iam.c 65/55265/4
Maximilian Dilger [Thu, 30 May 2024 16:18:48 +0000 (12:18 -0400)]
LU-7892 utils: removed deprecated create_iam.c

Removed create_iam.c and all found references. The OI is now created
by osd-ldiskfs, so it is safe to remove create_iam.c

Signed-off-by: Max Dilger <mdilger@whamcloud.com>
Change-Id: Ibbc89ecfbfbebf6f61d93d4a784b509977ccb3c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55265
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months ago LU-17878 build: compatibility updates for kernel 6.9 01/55201/3
Shaun Tancheff [Sat, 25 May 2024 23:20:38 +0000 (17:20 -0600)]
LU-17878 build: compatibility updates for kernel 6.9

Linux v6.8-2-ga8922f79671f
  ceph: remove SLAB_MEM_SPREAD flag usage

Provide a replacement for older kernels when SLAB_MEM_SPREAD
is not defined.

Linux v6.8-rc1-47-gc69ff4071935
  filelock: split leases out of struct file_lock

Provide abstractions for:
  flc_type, flc_pid, flc_file, flc_flags, and flc_owner

Test-Parameters: trivial
HPE-bug-id: LUS-12363
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ide457ba29fc2d3537f074fe9a66cf0c8567f7621
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55201
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17516 utils: new --mdt and --ost options for lfs df 56/55156/7
Frederick Dilger [Fri, 17 May 2024 20:19:17 +0000 (14:19 -0600)]
LU-17516 utils: new --mdt and --ost options for lfs df

Added [--mdt | -m] and [--ost | -o] options for 'lfs df' to print
only usage of the respective MDT or OST devices in mntdf(). If both
"--mdt" and "--ost" are specified it will show both types of devices
which is identical to having neither specified.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I196b7c9c0c385850372331587936fa5cf6b71d93
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17000 utils: Use correct printf specifier for lustre_rsync.c 54/55154/2
Arshad Hussain [Mon, 20 May 2024 07:17:58 +0000 (03:17 -0400)]
LU-17000 utils: Use correct printf specifier for lustre_rsync.c

In lr_copy_xattr() use "%s" for "char *" and "%zd"
for "ssize_t" data type.

Change 'struct lr_info' fields xsize and xvsize from size_t
to ssize_t, as extended attribute functions can return
negative values.
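
The specifier choice can be illustrated with a short sketch. The helper
format_xattr_result() and its message text are hypothetical and are not
taken from lr_copy_xattr(); the point is only that a signed size needs
"%zd" so that an error return such as -1 prints correctly.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format an xattr name ("%s" for char *) together
 * with a signed size or error value ("%zd" for ssize_t). Returns the
 * snprintf() result.
 */
static int format_xattr_result(char *buf, size_t buflen,
			       const char *name, ssize_t size)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, buflen, "xattr %s: %zd", name, size);
}
```

Storing the result in a size_t instead would turn -1 into a very large
positive number, which is why the struct fields are changed to ssize_t.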

Test-Parameters: trivial testlist=lustre-rsync-test
CoverityID: 397866 ("Invalid type in argument to printf format specifier")
CoverityID: 397573 ("Invalid type in argument to printf format specifier")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Idd7c4f81c1a1751c595c86b10493aab6f959059f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55154
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17854 lnet: Router should not drop msg past deadline 31/55131/7
Chris Horn [Wed, 22 May 2024 19:34:25 +0000 (13:34 -0600)]
LU-17854 lnet: Router should not drop msg past deadline

It has been observed that messages can become queued in LNet on
router nodes so long that they exceed their message deadlines. These
messages will currently be dropped, even if the target peer is alive.
PtlRPC adaptive timeouts can dynamically increase to account for the
increased network latency, but if the RPCs are dropped on routers then
these operations will fail. Routers should only drop messages when
the router peer health feature determines the target is down. This
gives Lustre the best chance to complete operations during periods of
increased network latency.
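
The change in drop policy can be sketched as below. This is a deliberately simplified model in Python, not the LNet router code; QueuedMessage, drop_before_patch() and drop_after_patch() are hypothetical names, and the health verdict is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass
class QueuedMessage:
    deadline: float      # time after which the message counts as late
    target_alive: bool   # verdict of the router peer health check


def drop_before_patch(msg: QueuedMessage, now: float) -> bool:
    """Simplified old rule: a message past its deadline is dropped."""
    return now > msg.deadline


def drop_after_patch(msg: QueuedMessage, now: float) -> bool:
    """Simplified new rule: drop only when the target is considered down."""
    return not msg.target_alive


late_but_alive = QueuedMessage(deadline=10.0, target_alive=True)
print(drop_before_patch(late_but_alive, now=20.0))
print(drop_after_patch(late_but_alive, now=20.0))
```

A late message to a live peer is dropped under the old rule and forwarded under the new one, leaving the timeout decision to PtlRPC adaptive timeouts.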

A bug in sanity-lnet/do_route_del() is fixed. The lnetctl route show
output was stored in a variable named "output", but the variable
"lnetctl_text" was checked to determine if the route needed to be
deleted.

test_102() was also modified to call cleanup_router_test(). A
comment there indicated it was not needed because the routes were
already deleted, but cleanup_router_test() does more than just
delete the route entries; namely, it unloads modules on all nodes.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-12153
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I1e6966d4a3a2b10dd7b99620774d5c32b7eccd1f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17000 ptlrpc: Use matching deallocator for cfs_expr_list_values 43/55043/3
Arshad Hussain [Wed, 8 May 2024 10:05:30 +0000 (15:35 +0530)]
LU-17000 ptlrpc: Use matching deallocator for cfs_expr_list_values

For the cfs_expr_list_values() allocator, use
cfs_expr_list_values_free() as the deallocator.

Coverity actually complained that free() should be called
instead of kfree(). It looks like Coverity is checking the
cfs_expr_list_values() function in libcfs/libcfs/util/string.c,
which calls calloc().
This cannot be correct under ptlrpc.

Test-Parameters: trivial
CoverityID: 424700 ("Incorrect deallocator used")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Idfbb6be585b35f87a59ae92d0cffa85c8dff623a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55043
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17566 mdt: improve new_init_ucred() for refactoring 25/55025/10
Aurelien Degremont [Wed, 6 Mar 2024 14:04:41 +0000 (15:04 +0100)]
LU-17566 mdt: improve new_init_ucred() for refactoring

In order to merge new_init_ucred() and old_init_ucred()
code eventually, move new_init_ucred() code around
for it to look even closer to old_init_ucred().

- Fill generic ucred fields at the beginning (similar to
what old_init_ucred() is doing).
- Move code for the bottom part to be closer to
old_init_ucred_common().

This code path is not used on most Lustre deployments,
so I'm enabling Kerberos testing to ensure some tests
will go through this code path.

Test-Parameters: kerberos=true testlist=sanity-krb5

Change-Id: I113fca6a104c1db66d9e0defd6fd91e378d7208c
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55025
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-13577 wbc: theoretical link with missing destination 16/55016/2
Shaun Tancheff [Mon, 6 May 2024 06:14:43 +0000 (13:14 +0700)]
LU-13577 wbc: theoretical link with missing destination

While unlikely, this would be possible, so return -EINVAL here.

CoverityID: 425353 ("Null pointer dereferences (FORWARD NULL)")

Fixes: 668dfb53de ("LU-13577 wbc: reimplement mkdir() by using intent lock")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I2a41d1a37820a3bc7b06dff42a4cc09386a88820
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55016
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17773 lov: avoid partly outside array bounds build error 44/54944/3
Bobi Jam [Sat, 1 Jun 2024 18:24:08 +0000 (11:24 -0700)]
LU-17773 lov: avoid partly outside array bounds build error

Avoid the "array subscript 'struct lov_stripe_md_entry[0]' is partly
outside array bounds of 'struct lov_stripe_md_entry[0]'" error.
Otherwise an lsme holder will be allocated for invalid lmm magic.

Fixes: 902fe290 ("LU-17261 lov: ignore broken components")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I5a403a0d230d2129e372fd8a22f58901cd0c1b68
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54944
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17750 kernel: update SLES15 SP4 [5.14.21-150400.24.100.2] 23/54823/5
Jian Yu [Tue, 28 May 2024 21:00:31 +0000 (14:00 -0700)]
LU-17750 kernel: update SLES15 SP4 [5.14.21-150400.24.100.2]

Update SLES15 SP4 kernel to 5.14.21-150400.24.100.2 for Lustre client.

Test-Parameters: trivial

Change-Id: I401e97f602e6c8c62fac73e3603eb0226745bba1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54823
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17883 kernel: update SLES15 SP5 [5.14.21-150500.55.65.1] 27/55227/3
Jian Yu [Tue, 28 May 2024 20:56:30 +0000 (13:56 -0700)]
LU-17883 kernel: update SLES15 SP5 [5.14.21-150500.55.65.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.65.1 for Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp5 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3

Change-Id: Ie0601c190e52d6192bf389338be51c77db03a9c2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55227
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
5 months agoLU-17402 kernel: RHEL 8.10 client and server support 00/54800/6
Jian Yu [Sat, 1 Jun 2024 17:58:32 +0000 (10:58 -0700)]
LU-17402 kernel: RHEL 8.10 client and server support

This patch makes changes to support RHEL 8.10 release
with kernel 4.18.0-553.el8_10 for Lustre client and server.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-3

Change-Id: I0a9a262d13e0b0de3607da0982468fd8b5f6a7aa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54800
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17749 kernel: update RHEL 8.9 [4.18.0-513.24.1.el8_9] 21/54821/6
Jian Yu [Sat, 1 Jun 2024 17:53:36 +0000 (10:53 -0700)]
LU-17749 kernel: update RHEL 8.9 [4.18.0-513.24.1.el8_9]

Update RHEL 8.9 kernel to 4.18.0-513.24.1.el8_9.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.8 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.8 serverdistro=el8.9 testlist=sanity

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.9 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.9 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.9 \
  testgroup=full-part-3

Change-Id: I94b5a95e9e85f2f5e0cddb1dbb519ef92520ad0b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54821
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
5 months agoLU-17641 kernel: update RHEL 9.3 [5.14.0-362.24.1.el9_3] 20/54820/6
Jian Yu [Sat, 1 Jun 2024 08:25:36 +0000 (01:25 -0700)]
LU-17641 kernel: update RHEL 9.3 [5.14.0-362.24.1.el9_3]

Update RHEL 9.3 kernel to 5.14.0-362.24.1.el9_3.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.2 serverdistro=el9.3 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.2 serverdistro=el9.3 testlist=sanity

Test-Parameters: optional clientdistro=el9.3 serverdistro=el9.3 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el9.3 serverdistro=el9.3 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el9.3 serverdistro=el9.3 \
  testgroup=full-part-3

Change-Id: Ifafb3fbbfdfcd82506daed44d3601a0d4357331e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54820
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17735 tests: extend health check to avoid spurious failures 65/54765/3
Srikanth Ramamurthy [Wed, 6 Oct 2021 19:06:25 +0000 (19:06 +0000)]
LU-17735 tests: extend health check to avoid spurious failures

Increase network check timeout to avoid spurious failures of tests
like sanity test 7b.  5 seconds is too short in certain environments.

Signed-off-by: Srikanth Ramamurthy <srramamu@microsoft.com>
Change-Id: Iea49f4f66efbfe0afa56ec81eadea6d792cab55f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54765
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17733 tests: sanity test_45 fix dirty count 63/54763/3
Cyrus Ramavarapu [Wed, 15 Nov 2023 14:30:37 +0000 (14:30 +0000)]
LU-17733 tests: sanity test_45 fix dirty count

Change test comparisons to ge (>=) instead of gt (>). A comment was
added identifying the race condition within the test related to the
completion of async I/O. Also updated the sync condition to check that
dirty bytes after sync are equal to 0.

Test-Parameters: testlist=sanity env=ONLY=45
Signed-off-by: Cyrus Ramavarapu <cramavarapu@microsoft.com>
Change-Id: I5826d82a16e20101a2eb8d415bdbde1b6bcc8d69
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54763
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17423 ldiskfs: pass NULL to ext4_dir_rec_len for dot and dotdot 24/54724/4
Li Dongyang [Tue, 14 May 2024 02:26:02 +0000 (12:26 +1000)]
LU-17423 ldiskfs: pass NULL to ext4_dir_rec_len for dot and dotdot

For '.' and '..' we should pass NULL to EXT4_DIR_ENTRY_LEN/
EXT4_DIR_REC_LEN/ext4_dir_rec_len(), as those entries do not
use extra fields to store the extra hash for the casefolded+fscrypt
case.

This has no impact yet as we do not support casefolded+fscrypt
right now.

Change-Id: I6df56aefdf6440e4c03088fec7a4d38a523cf8dc
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54724
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17709 build: Fix line breaks in CHECK_SYMBOLS 80/54680/2
Alexander Lezhoev [Fri, 5 Apr 2024 09:43:47 +0000 (12:43 +0300)]
LU-17709 build: Fix line breaks in CHECK_SYMBOLS

CHECK_SYMBOLS=$(find ${INT_O2IBPATH}* -name Module.symvers)
contains newlines if there is more than one Module.symvers,
which causes a build failure.
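
The problem can be reproduced with a simulated find result. The sketch below is in Python rather than the shell used by the build scripts, the two paths are made up, and collapsing whitespace is just one possible normalisation, not necessarily what the patch does.

```python
# Simulated output of `find ... -name Module.symvers` with two matches;
# the paths are invented for the example.
find_output = "/usr/src/ofa/a/Module.symvers\n/usr/src/ofa/b/Module.symvers\n"

# Command substitution strips the trailing newline but keeps the inner one.
check_symbols = find_output.rstrip("\n")

# Collapsing all whitespace runs gives a single-line, space-separated list.
single_line = " ".join(check_symbols.split())

print(repr(check_symbols))
print(repr(single_line))
```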

Test-Parameters: trivial
Fixes: 8b1d2a72f1 ("LU-16967 build: Add in-kernel-ko2iblnd driver")
HPE-bug-id: LUS-12229
Signed-off-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Change-Id: I980630dc646ec837e18b0e123999027a23aaa2d6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54680
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17431 nodemap: determine if nodemap is currently loading 01/55001/2
Sebastien Buisson [Thu, 2 May 2024 09:50:34 +0000 (11:50 +0200)]
LU-17431 nodemap: determine if nodemap is currently loading

To control operations on dynamic nodemaps, we need to know if the
current nodemap configuration is in the process of being loaded.
Only once it is fully loaded does the nodemap_mgs() function return true.
So change the semantics of the global variable nodemap_config_loaded to be:
* 0: not loaded yet
* 1: successfully loaded
* -1: loading in progress
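
The three states can be modelled as below. This is a sketch in Python, not the kernel code; the enum NodemapLoadState and both helper functions are hypothetical names, and only the numeric values come from the list above.

```python
from enum import IntEnum


class NodemapLoadState(IntEnum):
    LOADING = -1      # loading in progress
    NOT_LOADED = 0    # not loaded yet
    LOADED = 1        # successfully loaded


def config_is_loaded(state: int) -> bool:
    """True only once loading has completed successfully."""
    return state == NodemapLoadState.LOADED


def config_is_loading(state: int) -> bool:
    """True while the configuration is still being loaded."""
    return state == NodemapLoadState.LOADING


for state in NodemapLoadState:
    print(state.name, config_is_loaded(state), config_is_loading(state))
```

Because -1 is non-zero, a plain truthiness test would treat "loading in progress" as loaded, so callers of a tri-state value like this need explicit comparisons.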

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If5e52e924415f644d0134f4093c2405df0887f87
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55001
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17393 osd: recreate LAST_ID for local seq 98/53898/10
Hongchao Zhang [Tue, 19 Mar 2024 04:19:42 +0000 (12:19 +0800)]
LU-17393 osd: recreate LAST_ID for local seq

The file at /O/seq/LAST_ID in the sequences used by local storage
is not currently fixed by LFSCK. This patch adds support to
scan the local storage sequences under the root object directory "/O"
and recreate or fix it accordingly.

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I840a0fcfa207528c5a0e9f0c87df8b4745bba671
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53898
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-17064 build: check for Build-Parameters in commit 48/52448/19
Andreas Dilger [Wed, 20 Sep 2023 23:39:09 +0000 (17:39 -0600)]
LU-17064 build: check for Build-Parameters in commit

Check if the commit message contains any "Build-Parameters:" lines
embedded in the commit message, like "clientdistro=el9.2" to limit
builds to only the specified distros/arches.  Expect one directive
(for either client or server build) at a time.  The arch is optional,
in which case all architectures are built.

Also accept the "ignore" keyword in the Build-Parameters: line as
well as Test-Parameters: lines, since it is really a build directive.
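
A rough idea of how such directives could be extracted from a commit message is sketched below. It is written in Python for illustration and is not the build system's parser; parse_build_parameters() is a hypothetical helper, and representing a bare keyword such as "ignore" as an empty value is an assumption.

```python
def parse_build_parameters(commit_message: str) -> list[dict[str, str]]:
    """Collect key=value pairs from every 'Build-Parameters:' line."""
    directives = []
    for line in commit_message.splitlines():
        if not line.startswith("Build-Parameters:"):
            continue
        params = {}
        for token in line.split(":", 1)[1].split():
            key, sep, value = token.partition("=")
            # A bare keyword such as "ignore" has no "=value" part.
            params[key] = value if sep else ""
        directives.append(params)
    return directives


message = """LU-17064 build: check for Build-Parameters in commit

Build-Parameters: clientdistro=el9.3 clientarch=aarch64
Build-Parameters: distro=el9.3 arch=x86_64
Test-Parameters: trivial
"""
for directive in parse_build_parameters(message):
    print(directive)
```

A directive without an arch key would then mean "build all architectures", in line with the arch being optional.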

Build-Parameters: clientdistro=el9.3 clientarch=aarch64
Build-Parameters: distro=el9.3 arch=x86_64
Build-Parameters: distro=el8.9 arch=x86_64
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6fe1c59748e287b671a21cc3f3fdb0c4473ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52448
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-15963 osd-zfs: use contiguous chunk to grow blocksize 68/47768/47
Alex Zhuravlev [Fri, 24 Jun 2022 17:50:11 +0000 (20:50 +0300)]
LU-15963 osd-zfs: use contiguous chunk to grow blocksize

otherwise a sparse OST_WRITE can grow blocksize way too large.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I729775490f9a0c8262708931f321297af943f3c0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47768
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17900 llite: handle AT_GETATTR_NOSEC flag if present 96/55296/3
Bruno Faccini [Mon, 3 Jun 2024 14:47:51 +0000 (16:47 +0200)]
LU-17900 llite: handle AT_GETATTR_NOSEC flag if present

Starting with v6.7-rc1-1-g8a924db2d7b5, vfs_getattr_nosec() can pass
an additional AT_GETATTR_NOSEC flag to the underlying FS getattr()
interface routine.
It must therefore be masked in ll_vfs_getattr() to avoid passing it
back to vfs_getattr(), as ecryptfs/overlayfs already do, so that a
warning/stack trace is no longer displayed.
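
The masking step amounts to clearing one bit before the flags are passed on. The sketch below shows the bit arithmetic in Python, not the llite code; strip_internal_flags() is a hypothetical name, and the numeric flag values are my reading of the Linux headers and should be treated as assumptions.

```python
# Flag values as I understand them from the Linux headers; treat them as
# assumptions and check the kernel you build against.
AT_STATX_FORCE_SYNC = 0x2000
AT_GETATTR_NOSEC = 0x80000000


def strip_internal_flags(query_flags: int) -> int:
    """Clear the kernel-internal bit before handing flags to vfs_getattr()."""
    return query_flags & ~AT_GETATTR_NOSEC


incoming = AT_STATX_FORCE_SYNC | AT_GETATTR_NOSEC
print(hex(incoming), "->", hex(strip_internal_flags(incoming)))
```

All other request bits pass through unchanged, so the caller's sync behaviour is preserved.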

Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I1d041913a6fc3ab9158fd611cb7d14dd1b7f694b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55296
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-6142 mgc: SPDX for management client 85/55285/2
Timothy Day [Sun, 2 Jun 2024 22:50:52 +0000 (22:50 +0000)]
LU-6142 mgc: SPDX for management client

Convert from verbose license text to SPDX.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I24de13d3c859710e439b880afd1c6024c2da8937
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17675 tests: flush opencache in sanity-flr/61a 88/54788/5
Alex Zhuravlev [Mon, 15 Apr 2024 05:38:39 +0000 (08:38 +0300)]
LU-17675 tests: flush opencache in sanity-flr/61a

flush opencache to update MDS's atime with close RPC

Test-Parameters: trivial testlist=sanity-flr clientdistro=el9.3
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5f4d3400b3f772553ee6004ac271a4aa644699e0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54788
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-6142 llite: Fix style issues for llite_lib.c 40/54140/6
Arshad Hussain [Thu, 22 Feb 2024 06:23:20 +0000 (11:53 +0530)]
LU-6142 llite: Fix style issues for llite_lib.c

This patch fixes issues reported by checkpatch
for file lustre/llite/llite_lib.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I593c37a3dd19c9915c44e18033ce53dc965bbbda
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54140
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17711 osd-ldiskfs: do not delete dotdot during rename 23/54723/10
Li Dongyang [Wed, 17 Apr 2024 05:36:55 +0000 (15:36 +1000)]
LU-17711 osd-ldiskfs: do not delete dotdot during rename

Since upstream kernel commit v5.12-rc4-32-g6c0912739699
ext4_dir_entry_2 after rec_len will be wiped when deleting
the entry.

This creates a problem with rename: when we delete dotdot
first and it's a dx dir, the kernel will wipe the entire dx_root
in the block after the dotdot entry.

We can just update the dotdot entry in-place without deleting.

For dx dirs, ext4_update_dotdot() takes care of dotdot and
inserting dotdot is an update, use it for linear dirs also.

Rewrite ext4_update_dotdot() to get a few fixes:
*use ext4_read_dirblock to get the first block.
*do not assert on data read from disk, we check the dot and
dotdot entry and if anything looks wrong, we return -EFSCORRUPTED.
*make sure the change is journalled.
*set metadata_csum correctly for dx dirs.
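
The "return an error instead of asserting" point can be illustrated with a toy check. This is Python, not the ldiskfs patch: check_dot_entries() is a hypothetical helper that only looks at entry names, whereas the real code validates on-disk directory entries.

```python
EFSCORRUPTED = 117  # ext4 defines EFSCORRUPTED as EUCLEAN, 117 on Linux


def check_dot_entries(names: list[str]) -> int:
    """Return 0 if the block starts with '.' and '..', else -EFSCORRUPTED."""
    if len(names) < 2 or names[0] != "." or names[1] != "..":
        # Data read from disk may be damaged: report it, do not assert.
        return -EFSCORRUPTED
    return 0


print(check_dot_entries([".", "..", "file"]))
print(check_dot_entries(["..", "."]))
```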

Update ext4-data-in-dirent.patch, if dotdot entry has no space
for dirdata, try to expand the dotdot entry by moving the
entries behind it, or move the dx_root for dx dirs.

Add conf-sanity/154 to verify that the ".." entry was updated
properly after restore, including with an htree split directory
with dx_root entry.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I33e862739fa44f583aaa4369190d6d80271db13b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54723
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-16822 lnet: use proper Netlink flags for setup 29/55129/5
James Simmons [Fri, 31 May 2024 17:33:46 +0000 (11:33 -0600)]
LU-16822 lnet: use proper Netlink flags for setup

The Netlink flags sent to lnet_net_conf_cmd() were incorrect.
You can't use both NLM_F_EXCL and NLM_F_REPLACE together, since
these flags are opposites. Together these flags
also equal NLM_F_DUMP, which the kernel doesn't support for this
operation, so it failed with EOPNOTSUPP. That error tells user land
to use the old API, so the failure wasn't easily detected.
We replace NLM_F_REPLACE with NLM_F_APPEND to avoid this
issue. Also, for some reason lct_version gets stomped on,
so we can't use it.
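
The flag collision is plain bit arithmetic, shown here in Python with the values from <linux/netlink.h>. Combining NLM_F_EXCL with NLM_F_APPEND as the "fixed" request is my reading of the description above, not a copy of the patch.

```python
# Modifier flags for GET requests and for NEW requests share bit positions.
NLM_F_ROOT = 0x100
NLM_F_MATCH = 0x200
NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH

NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NLM_F_APPEND = 0x800

bad = NLM_F_EXCL | NLM_F_REPLACE   # what was being sent
good = NLM_F_EXCL | NLM_F_APPEND   # after swapping REPLACE for APPEND

print(hex(bad), bad == NLM_F_DUMP)
print(hex(good), (good & NLM_F_DUMP) == NLM_F_DUMP)
```

Since NLM_F_REPLACE and NLM_F_EXCL occupy the same bits as NLM_F_ROOT and NLM_F_MATCH, the kernel sees the combination as a dump request; NLM_F_APPEND uses a different bit and avoids the overlap.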

Fixes: ab6c8bd18e1 ("LU-16822 lnet: always initialize IPv6 at start up")
Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I6b9eb013f6fc10276e91848d7b5f17d406fbbdb4
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55129
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-16822 tests: Setup IPv6 with fake network device 99/53599/3
James Simmons [Fri, 5 Jan 2024 14:00:24 +0000 (07:00 -0700)]
LU-16822 tests: Setup IPv6 with fake network device

Several of the LNet sanity tests create a fake network device and
set up an IP address. Only an IPv4 address was set up, so also set up
an IPv6 address to increase the testing coverage for large NID support.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: If29adf74f1fe6449ad3f48663c2872a39bf4664c
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53599
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-17871 ldlm: FLOCK ownlocks may be not set 84/55184/7
Andriy Skulysh [Wed, 3 Apr 2024 10:34:32 +0000 (13:34 +0300)]
LU-17871 ldlm: FLOCK ownlocks may be not set

Conflict checking loop should continue until ownlocks is set.
Ownlocks variable is essential for lock merges.

Change-Id: Ied526581dd7d4f100c95f2fe582d117a87a8a584
Fixes: b07a57027e ("LU-15402 ldlm: speedup RD flock enqueue")
HPE-bug-id: LUS-12243
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55184
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17881 build: fix _flavor definition 25/55225/2
Caleb Carlson [Tue, 28 May 2024 17:40:56 +0000 (11:40 -0600)]
LU-17881 build: fix _flavor definition

Allow user to override _flavor definition.
Move ordering of _kver definition to be fully
defined before using it to define _flavor.
This prevents _flavor from getting defined with
the kernel patch version field.

Signed-off-by: Caleb Carlson <caleb.carlson@hpe.com>
Test-Parameters: trivial
HPE-bug-id: LUS-12267
Change-Id: Ibd4db360d8c16f487453593cb0a9fd2a6a5a8c62
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55225
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17877 lnet: export REGISTER_FUNC with EXPORT_SYMBOL_GPL 17/55217/3
Rebanta Mitra [Mon, 27 May 2024 07:57:13 +0000 (00:57 -0700)]
LU-17877 lnet: export REGISTER_FUNC with EXPORT_SYMBOL_GPL

This patch exports REGISTER_FUNC and UNREGISTER_FUNC
with EXPORT_SYMBOL_GPL to load GPL-licensed modules.

Test-Parameters: trivial

Change-Id: I3a0d4e2b27911af36e210692d28892590eb0371c
Signed-off-by: Rebanta Mitra <rmitra@nvidia.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55217
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-17343 utils: added --path option for lctl list_param 02/55202/3
Frederick Dilger [Sat, 25 May 2024 23:23:20 +0000 (19:23 -0400)]
LU-17343 utils: added --path option for lctl list_param

Added 'lctl list_param [-p] PARAM' option that prints the
actual pathname(s) for PARAM instead of the parameter name(s).
This should allow users to "resolve" PARAM pathnames so that they
can be used directly, which avoids having to hard code them. Also
renamed "po_only_path" and "po_show_path" to be "po_only_name" and
"po_show_name" to avoid confusion with "po_only_pathname" for the new
option.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I2259b930f3ac5cc46ac7a9a36218a44fa110157c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55202
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17699 utils: new --skip option for lfs find 00/55200/5
Frederick Dilger [Sat, 25 May 2024 21:46:08 +0000 (17:46 -0400)]
LU-17699 utils: new --skip option for lfs find

Added [--skip | -k] options for 'lfs find' to skip a percentage of all
files. This brings the benefit of allowing a certain percentage of the
files in the scanned directory to be migrated to new OSTs instead of
migrating entire directory trees. Note that the file size is not taken
into account when skipping files, which could result in an unbalanced
set of files being returned. If there are fewer than 25 files there
could be an increased margin of error with the results; however, it
should still be relatively negligible (at most 10%).
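
One way to picture percentage-based skipping is an independent random draw per file, sketched below in Python. The commit message does not say how 'lfs find' makes the choice, so the random draw, the keep_file() helper and the seed are all assumptions made for illustration.

```python
import random


def keep_file(skip_percent: float, rng: random.Random) -> bool:
    """Return True if a file should be reported, False if it is skipped."""
    return rng.random() * 100.0 >= skip_percent


rng = random.Random(0)  # fixed seed so the run is repeatable
names = [f"file{i}" for i in range(10000)]
kept = [name for name in names if keep_file(75, rng)]
print(f"kept {len(kept)} of {len(names)} files")
```

With a per-file draw the kept fraction only approaches the requested percentage for large file counts, which is consistent with the larger margin of error noted for small directories, and file sizes play no part in the choice.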

Planned to implement further utility with --skip-rebalance, which would
calculate the percentage of files that need to be returned vs. skipped
based on the fullness ratio of each OST vs. the average fullness of a
balanced filesystem, to avoid the user having to calculate the skip
percentage themselves.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I3ff1600f25f3be54f2a353fa78f7b8b7f98f591a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55200
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17873 test: ignore WIFSIGNALED if rc is 0 94/55194/3
Hongchao Zhang [Sat, 20 Apr 2024 17:53:11 +0000 (01:53 +0800)]
LU-17873 test: ignore WIFSIGNALED if rc is 0

Ignore the result of the WIFSIGNALED check if the return status
of the "lctl test_create" thread is zero.

Test-Parameters: trivial envdefinitions=SLOW=yes,DEBUG_SIZE=64 mdtcount=1 testlist=mds-survey,mds-survey,mds-survey,mds-survey,mds-survey,mds-survey
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ifc3727d48010c9f00f38baff9ff91b5cc3afce5c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55194
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-9859 libcfs: move crypto wrappers to lnet 87/55187/5
James Simmons [Fri, 24 May 2024 00:07:21 +0000 (20:07 -0400)]
LU-9859 libcfs: move crypto wrappers to lnet

The crypto wrappers in libcfs are one of the last items in the module
that are not debugging related. We can move them to LNet, which
moves us closer to libcfs being just a debugging module.

Test-Parameters: trivial
Change-Id: Idbc058fe2cafc04e4300a576e3368c0961ce98a4
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-15644 llog: don't replace llog error with -ENOTDIR 51/55151/2
Mikhail Pershin [Sat, 18 May 2024 19:43:05 +0000 (22:43 +0300)]
LU-15644 llog: don't replace llog error with -ENOTDIR

dt_try_as_dir() contains a check for object existence,
and a missing object is reported as -ENOTDIR. In the llog
case that error propagates to the upper level and is reported
on the console. Neither the error code nor the debug level
is appropriate there.

This patch skips the check for object existence in the llog
case, as it is excessive anyway.
The debug level is reduced as well, to avoid spamming the
console with messages for -ENOENT, -ESTALE or -EIO errors.

Fixes: 1ebc9ed460 ("LU-15902 obdclass: dt_try_as_dir() check dir exists")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id404204566898a6ac2e258b7824491effc5fc92e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55151
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17848 dt: cleanup dt_object.h header 26/55126/4
Timothy Day [Sat, 18 May 2024 16:54:37 +0000 (16:54 +0000)]
LU-17848 dt: cleanup dt_object.h header

Cleanup a number of LASSERT statements to unify style.

Use kernel doc style instead of the old Doxygen style. Avoid
using ** for comments that aren't kernel doc.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia23492534a05bce4850ca38ab7c06a07000504d3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55126
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-16904 tests: Fix sanity test 248c and 360 when PFL layout is used 18/54918/3
Wei Liu [Thu, 25 Apr 2024 18:35:13 +0000 (11:35 -0700)]
LU-16904 tests: Fix sanity test 248c and 360 when PFL layout is used

For 248c, use stripe count 1 for root until the readahead issue described
in LU-17755 and LU-15155 is fixed.
For 360, use stripe count 1 for the test dir to make sure the files created
under it have an object larger than 1M on a single OST to test delayed iput.

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=fs_STRIPEPARAMS="-E 1M -c1 -E eof" env=ONLY="248c,360"
Test-Parameters: testlist=sanity env=fs_STRIPEPARAMS="-E 64k -c 1 -E eof" env=ONLY="248c,360"
Test-Parameters: testlist=sanity env=fs_STRIPEPARAMS="-E 64k -c 1 -E eof -c 2" env=ONLY="248c,360"
Test-Parameters: testlist=sanity env=fs_STRIPEPARAMS="-E 64k -c 1 -E 1M -c 2 -E eof -c 4 -S 4M" env=ONLY="248c,360"

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: I93341001714c5d0942f2f8f2895ca8bb545dc344
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54918
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-17710 llite: protect parallel accesses to lli_*id 27/54727/5
Etienne AUJAMES [Wed, 10 Apr 2024 17:08:49 +0000 (19:08 +0200)]
LU-17710 llite: protect parallel accesses to lli_*id

OSC obtains the process uid/gid/jobid from the ll_inode_info. This can be
racy if several processes access the same file, and can lead to
a corrupted or incoherent set of values.

This patch replaces the fields lli_jobid/lli_uid/lli_gid with a common
"struct job_info lli_jobinfo" field.

struct job_info {
       char ji_jobid[LUSTRE_JOBID_SIZE];
       __u32 ji_uid;
       __u32 ji_gid;
};

The accesses are protected by a seqlock (lli_jobinfo_seqlock).
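
The seqlock pattern mentioned above can be illustrated with a minimal
user-space sketch. This is not the kernel seqlock API and not the code
from the patch: it uses C11 atomics, assumes a single writer, and reduces
the structure to hypothetical uid/gid fields (the jobid string is
omitted) so that every shared field can be an atomic.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the job info fields. */
struct job_ids {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	_Atomic uint32_t uid;
	_Atomic uint32_t gid;
};

/* Writer side; assumes only one writer runs at a time. */
static void job_ids_write(struct job_ids *j, uint32_t uid, uint32_t gid)
{
	unsigned int s = atomic_load_explicit(&j->seq, memory_order_relaxed);

	/* Mark the data as "being updated" (odd sequence number). */
	atomic_store_explicit(&j->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&j->uid, uid, memory_order_relaxed);
	atomic_store_explicit(&j->gid, gid, memory_order_relaxed);
	/* Publish the update (even sequence number again). */
	atomic_store_explicit(&j->seq, s + 2, memory_order_release);
}

/* Reader side; retries until it sees a consistent uid/gid pair. */
static void job_ids_read(struct job_ids *j, uint32_t *uid, uint32_t *gid)
{
	unsigned int s0, s1;

	do {
		s0 = atomic_load_explicit(&j->seq, memory_order_acquire);
		*uid = atomic_load_explicit(&j->uid, memory_order_relaxed);
		*gid = atomic_load_explicit(&j->gid, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&j->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);
}
```

Readers never block the writer; they only retry when a write overlapped
their read, which suits data that is read far more often than written.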

Additionally, this saves and restores process uid/gid values for
readahead works (cra_jobid is replaced by cra_jobinfo).

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Idf01c1e4b533aea405c3a4439c0df0fcfc4dea56
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54727
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Thomas Bertschinger <bertschinger@lanl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17710 obdclass: background jobid garbage collection 26/54726/3
Etienne AUJAMES [Wed, 10 Apr 2024 12:16:41 +0000 (14:16 +0200)]
LU-17710 obdclass: background jobid garbage collection

The jobid pidmap garbage collection is done directly in
lustre_get_jobid()/jobid_get_from_cache() every 5 min.

This patch runs the garbage collection in the background with a
"delayed work" handler.

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I5719e278ec6bde0f8c15fd2e3fe9757c714747c4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54726
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Thomas Bertschinger <bertschinger@lanl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17718 obdclass: potential string overflow upcall_cache.c 10/54710/6
Sebastien Buisson [Tue, 9 Apr 2024 13:00:41 +0000 (15:00 +0200)]
LU-17718 obdclass: potential string overflow upcall_cache.c

Use strncpy() in upcall_cache_set_upcall() to quiet Coverity warning.
And reorganize the function so that the code flow is more linear in
the success case.
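
A bounded copy of this kind might look roughly like the sketch below.
It is not the actual upcall_cache_set_upcall() code; the function name
and buffer size are hypothetical and only illustrate a strncpy() with
an explicit length check and NUL termination.

```c
#include <string.h>

#define UPCALL_BUF_SIZE 16	/* hypothetical buffer size */

/* Copy src into a fixed-size buffer; returns 0 on success, -1 if too long. */
static int set_upcall_name(char dst[UPCALL_BUF_SIZE], const char *src)
{
	/* Reject input that would not fit together with its NUL byte. */
	if (strlen(src) >= UPCALL_BUF_SIZE)
		return -1;

	strncpy(dst, src, UPCALL_BUF_SIZE - 1);
	dst[UPCALL_BUF_SIZE - 1] = '\0';	/* strncpy() may not terminate */
	return 0;
}
```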

CoverityID: 424705: ("String overflow")

Fixes: 2153e86541 ("LU-17497 obdclass: check upcall incorrect values")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1aee77f78c92c6c571dfe358435a2733cc3ba9d9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54710
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 months agoLU-7665 test: improve sanity 300p 25/54625/2
Lai Siyao [Mon, 18 Mar 2024 05:13:14 +0000 (01:13 -0400)]
LU-7665 test: improve sanity 300p

Sanity test 300p sets OBD_FAIL_OUT_ENOSPC once, but it may fail an llog
operation (not critical), so the subsequent mkdir succeeds. Change
the fail_loc to always fail so the test can be more robust.

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I128ce39aaf97e1785a8c135a696d0b404b48a2a8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54625
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-11085 ldlm: optimise extent locks with identical extent 87/54587/17
Mr NeilBrown [Thu, 28 Mar 2024 23:15:55 +0000 (19:15 -0400)]
LU-11085 ldlm: optimise extent locks with identical extent

Having many locks with an identical extent is (apparently) common.  Rather than
putting all of these locks in the extent tree, possibly making it much
bigger than needed, link them all together with only one in the extent
tree.

When removing the one in the extent tree, if there are others, one
of those must be placed in the tree where the original was.
extent_replace() does this. It could be in generic code.

A new extent_insert_unique() is added.  Ideally this would be provided
by the standard interval_tree code.

As extent_insert() is now not used, INTERVAL_TREE_DEFINE is told to
make all functions 'static inline' so we don't get warnings about the
unused function.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9f8433514f8451abc80bbb6050499599e0f93520
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
5 months agoLU-17462 build: make some deb packages optional 88/53788/10
Shaun Tancheff [Sat, 3 Feb 2024 07:21:03 +0000 (14:21 +0700)]
LU-17462 build: make some deb packages optional

Make building the utils, tests and iokit packages optional.

Also, mpi is optional via --disable-mpitests.

If --disable-mpitests or --disable-tests is given, the mpi
package dependencies should also be dropped.

Test-Parameters: trivial
HPE-bug-id: LUS-12091
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Icd232571f7052ec0a4b25c32ff573c3b5f76de21
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53788
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
5 months agoLU-17233 dkms: support for kfilnd and gnilnd 56/52856/11
Shaun Tancheff [Sat, 25 May 2024 17:12:10 +0000 (11:12 -0600)]
LU-17233 dkms: support for kfilnd and gnilnd

dkms should try to build kkfilnd if kfi support is
detected. Similarly, kgnilnd can be built if its prerequisites
can be found.

Test-Parameters: trivial
HPE-bug-id: LUS-12070, LUS-11893, LUS-11902
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie4b0957e7a0eda4f25ae96a12619baae6d6d170a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52856
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17509 dkms: SUSE15 depends on unavailable libmount 45/53945/6
Shaun Tancheff [Wed, 7 Feb 2024 09:53:13 +0000 (16:53 +0700)]
LU-17509 dkms: SUSE15 depends on unavailable libmount

SUSE renamed libmount to libmount1; however, the requirement
is satisfied by libmount-devel, which in turn requires the
appropriate libmount package.

Drop the explicit libmount requirement from lustre-dkms.spec

Test-Parameters: trivial
HPE-bug-id: LUS-12141
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ifdee172483b73f9f66eb97883851febf94134309
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53945
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
5 months agoLU-17461 dkms: improve /etc/sysconfig/lustre 87/53787/8
Shaun Tancheff [Tue, 6 Feb 2024 18:48:44 +0000 (01:48 +0700)]
LU-17461 dkms: improve /etc/sysconfig/lustre

Expand the features available in /etc/sysconfig/lustre
to give dkms users more flexibility.

Providing y/n switches for common features:
    LUSTRE_DKMS_ENABLE_GSS=y/n
    LUSTRE_DKMS_ENABLE_GSS_KEYRING=y/n
    LUSTRE_DKMS_ENABLE_CRYPTO=y/n
    LUSTRE_DKMS_ENABLE_IOKIT=y/n

As well as a catch-all to pass to configure:
    LUSTRE_DKMS_CONFIGURE_EXTRA='string passed to configure'

Add support for dpkg checking for libkrb5-dev to enable or
disable gss by default, if it is not otherwise specified.

Test-Parameters: trivial
HPE-bug-id: LUS-12097
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Id8dd17c867d9aeb1ec27632729433ba128dcfd0a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53787
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-13881 pcc: comparator support for PCC rules 85/39585/20
Qian Yingjin [Thu, 6 Aug 2020 08:29:21 +0000 (16:29 +0800)]
LU-13881 pcc: comparator support for PCC rules

There are increasing requirements for PCC rules to add comparator
support:
- File data larger or smaller than a certain threshold should not
  be auto cached in PCC (e.g. larger than the capacity of the PCC
  backend on a client).
- Users can specify a range of UID/GID/ProjID for auto caching on
  PCC when defining a rule.

In addition to the original equal (=) operator, this patch also
adds greater than (>) and less than (<) comparison operators.

The following rule expressions are supported:
- "projid={100}&size>{1M}&size<{500G}"
- "projid>{100}&projid<{110}"
- "uid<{1500}&uid>{1000}"
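
How a conjunction such as "uid<{1500}&uid>{1000}" could be evaluated is
sketched below. This is only an illustration, not the PCC rule parser:
the types and function names are hypothetical, each condition holds a
single value (a simplification), and all conditions joined by '&' are
assumed to apply to the same attribute.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation of one parsed condition. */
enum cmp_op { CMP_EQ, CMP_GT, CMP_LT };

struct cond {
	enum cmp_op op;
	uint64_t    value;
};

/* Evaluate a single "attr OP {value}" condition against v. */
static bool cond_match(const struct cond *c, uint64_t v)
{
	switch (c->op) {
	case CMP_EQ: return v == c->value;
	case CMP_GT: return v > c->value;
	case CMP_LT: return v < c->value;
	}
	return false;
}

/* All conditions joined by '&' must hold for the rule to match. */
static bool conds_match_all(const struct cond *conds, int n, uint64_t v)
{
	for (int i = 0; i < n; i++)
		if (!cond_match(&conds[i], v))
			return false;
	return true;
}
```

With the two conditions { CMP_LT, 1500 } and { CMP_GT, 1000 }, a uid of
1200 matches, while 1000 and 1500 do not, since both bounds are strict.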

EX-2872 pcc: mtime rule for 'lctl pcc add'

Add an "mtime>N" rule to allow skipping files for PCC-RO auto-attach
if they were created or modified more than N seconds ago.  Otherwise,
it may be that files are added to the PCC cache before they finished
writing, or if they will be modified again quickly after creation.
Was-Change-Id: Ibb99bff5b483717ae6e5b83f82f1bcd86c3ebbe5

This patch disabled sanity-pcc/test_33 on rhel9.3 kernel until the
inconsistent LSOM problem is solved.

EX-bug-id: EX-2872
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I9f024eb6903f5652ba3cf04fa289456803493b2c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39585
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-12373 pcc: uncache the pcc copies when remove a PCC backend 52/38352/25
Qian Yingjin [Fri, 14 Jun 2019 09:29:55 +0000 (05:29 -0400)]
LU-12373 pcc: uncache the pcc copies when remove a PCC backend

Currently, when a PCC backend is removed from a client, there is
no special handling for previously cached files at all.
Users can still use the PCC caching service for these files. This
may not be what users want. The reasons are as follows:

1) For RW-PCC cached files, the data is not restored back
into the Lustre OSTs of the main filesystem. The PCC
backend does fall back to a traditional HSM storage solution,
since the lhsmtool_posix copytool is still running on this
client. But this is dangerous, and likely to cause user data
to be lost if the PCC device becomes permanently unavailable.

2) The space used by these PCC cached files may not be released.

In this patch, when a PCC backend is removed from a client, the
default action is to scan the PCC backend fs and uncache
(detach and remove) each PCC copy from PCC by FID.

We also add an option "--keep|-k" for PCC backend removal.
It behaves as before: the PCC backend is removed, but
the data on the cache is retained.

This patch also introduces a common library to scan the HSM
backend.

EX-2579 pcc: support a flatter HSM archive format

Add versioning (v1 and v2) to the HSM (PCC) archive format (directory
layout):
v1: (oid & 0xffff)/-/-/-/-/-/FID
v2: ((oid ^ seq) & 0xffff)/FID
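
The first path component of each layout can be computed as in the
sketch below. The function names are hypothetical, and only the
top-level bucket number is shown; how the bucket and the FID are
formatted into a path is left out.

```c
#include <stdint.h>

/* Top-level bucket for the v1 layout: (oid & 0xffff). */
static unsigned int archive_bucket_v1(uint64_t seq, uint32_t oid)
{
	(void)seq;	/* seq is not used for the first v1 component */
	return oid & 0xffff;
}

/* Top-level (and only) bucket for the v2 layout: ((oid ^ seq) & 0xffff). */
static unsigned int archive_bucket_v2(uint64_t seq, uint32_t oid)
{
	return (unsigned int)((oid ^ seq) & 0xffff);
}
```

Mixing seq into the bucket lets v2 spread FIDs over the 65536 buckets
with a single directory level instead of the nested levels used by v1.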

v1 is the original layout and the default. v2 is the new layout which
should be selected for new installs.
Was-Change-Id: If660f3cf4c02469bb23e65a44f86f0346367adf6

LU-12373 pcc: delete stale PCC copy when remove PCC backend

By default, when removing a PCC backend from a client, the action
is to scan the PCC backend FS, uncache (detach and remove) all
scanned PCC copies from PCC by FIDs.

However, during the tests, we found that some old stale PCC copies
are not removed when an administrator runs "lctl pcc del|clear".
The reason is that these PCC copies are already detached from PCC
when running the commands.

This patch fixes this bug: when removing a PCC backend from a
client, it will also delete all non-cached PCC copies from PCC
backend to free up the space.
Was-Change-Id: Id829abe7e6cb1294e6baea76452f4a9178711451

EX-bug-id: EX-2579
Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib4db36137c025fd78c7022c8b8c39b63e3b9ad4d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/38352
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 months agoLU-10918 pcc: auto RO-PCC caching when O_RDONLY open files 46/38346/33
Qian Yingjin [Wed, 22 Aug 2018 13:19:48 +0000 (21:19 +0800)]
LU-10918 pcc: auto RO-PCC caching when O_RDONLY open files

During the file open() operation, if the file is being opened with
O_RDONLY flags, and the file matches the predefined rule, it will
be prefetched and attached into RO-PCC automatically.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib2c2ab51d67aed84eb7676c8df191faa33dfad39
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/38346
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
5 months agoLU-10499 pcc: test interoperability with PCC-RO 86/54386/14
Qian Yingjin [Thu, 25 Mar 2021 02:44:16 +0000 (10:44 +0800)]
LU-10499 pcc: test interoperability with PCC-RO

Many of the PCC-RO specific subtests fail against Lustre 2.15.0
servers.
In this patch, each subtest related to PCC-RO adds a connect
flag check and is skipped when run against old servers without
PCC-RO support.

EX-4006 pcc: make "pccro=1" default

To avoid a risk that users will accidentally configure PCC-RW and
potentially lose data if those client nodes go offline, this patch
makes "pccro=1" default for PCC backends.

This patch adds a new option "--w|--write" for PCC-RW cache
mode when attaching a file.
It also makes "--r|--readonly" the default option for the PCC attach
command.
Was-Change-Id: I56735b0ebe8f0d9ef22b3f7e39e8cccfa3aad443

EX-8739 tests: skip sanity-pcc tests on el9.3

Skip sanity-pcc test_6, test_7a/7b, test_23, test_35 on RHEL9.3
clients due to continuous failures with PCC-RW, which is unused.

Skip sanity-pcc test_102 due to el9.3 fio io_uring bug.
Was-Change-Id: I76cbd0342788fff8b0167c0656e941f96d73fc48

EX-bug-id: EX-2860 EX-4006 EX-8739
Test-Parameters: clientdistro=el9.3 serverversion=EXA6 serverdistro=el8.8 testlist=sanity-pcc
Test-Parameters: clientdistro=el8.9 serverversion=EXA6 serverdistro=el8.8 testlist=sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ie4fc41b2dc51a038027009fbcc6e86f9d61cd54f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17743 o2iblnd: fix privileged port check in passive_connect 66/55266/2
Serguei Smirnov [Thu, 30 May 2024 18:14:09 +0000 (11:14 -0700)]
LU-17743 o2iblnd: fix privileged port check in passive_connect

Check that the port is in "privileged" range only if
kib_require_priv_port is set
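
The intended condition can be sketched as below. This is not the
o2iblnd code: the function name is hypothetical, and the limit of 1024
for the "privileged" range is an assumption made for the example.

```c
#include <stdbool.h>

#define PRIV_PORT_LIMIT 1024	/* assumption: ports below this are privileged */

/* Accept any port unless privileged ports are explicitly required. */
static bool peer_port_acceptable(bool require_priv_port, unsigned int port)
{
	if (!require_priv_port)
		return true;
	return port < PRIV_PORT_LIMIT;
}
```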

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 9b18afa ("LU-17743 ko2iblnd: move to struct lnet_nid")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I3ed9c174d983be68aecc4b8e12aaae7c096d26e8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55266
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-17250 mgs: fix resource leak in name_create_osp 38/55238/2
Etienne AUJAMES [Wed, 29 May 2024 19:32:27 +0000 (21:32 +0200)]
LU-17250 mgs: fix resource leak in name_create_osp

This patch fixes a resource leak detected by Coverity:

CID 425355:    (RESOURCE_LEAK)
/lustre/mgs/mgs_llog.c: 189 in name_create_osp()

Fixes: d4682ff ("LU-17250 mgs: generate a new MDT configuration by copy")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I8e0cbc3507e5a9882b2cfadfd68aea318575fc7a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55238
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17869 llapi: Fixed function header comments 80/55180/3
Rajeev Mishra [Wed, 22 May 2024 23:41:34 +0000 (23:41 +0000)]
LU-17869 llapi: Fixed function header comments

Updated input parameter descriptions for
`llapi_layout_v2_sanity` and `llapi_layout_sanity` functions.

Test-parameters: trivial
Fixes: ee7dfc5ad1 ("LU-17025 llapi: Verify stripe pool name")
Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
Change-Id: I72f4973d8be70ad60d088ea0e18d1e961f01cd50
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55180
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
5 months agoLU-15781 ldiskfs: support 5.15.0-106+ ubuntu kernels 78/55078/3
James Simmons [Wed, 15 May 2024 14:22:33 +0000 (10:22 -0400)]
LU-15781 ldiskfs: support 5.15.0-106+ ubuntu kernels

Starting with 5.15.0-106 kernels the ext4-prealloc patch no
longer applies. Update ext4-prealloc.patch so it can build
again.

Test-Parameters: trivial
Change-Id: I958c64842c5e1dc8b974e8a188fa18541d458ab5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55078
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17356 quota: fix qmt_pool_new_conn 19/55019/3
Sergey Cheremencev [Mon, 6 May 2024 13:05:36 +0000 (16:05 +0300)]
LU-17356 quota: fix qmt_pool_new_conn

Wrong argument passed into qmt_dom from
qmt_pool_new_conn caused a panic:

  qmt_sarr_get_idx()) ASSERTION( arr_idx <
    qpi->qpi_sarr.osts.op_count && arr_idx >= 0 )
    failed: idx invalid 0 op_count 0

Add conf-sanity_33d that reproduces above
assertion without the fix.

Fixes: 67f90e4288 ("LU-17034 quota: lqeg_arr memmory corruption")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I48801f1fb7e69097cbfbe083f1d31a4639d4bf4d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55019
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17653 sec: avoid unlocking unlocked page 34/54734/6
Shaun Tancheff [Thu, 11 Apr 2024 14:03:30 +0000 (22:03 +0800)]
LU-17653 sec: avoid unlocking unlocked page

If a page is unlocked by @cl_2queue_disown after explicitly writing the
newly modified page, the following page_unlock must not be performed.

Track the page locked state and do not unlock pages which
are not locked in ll_io_zero_page().

Fixes: adf46db962 ("LU-12275 sec: support truncate for encrypted file")
Test-Parameters: testlist=sanity-sec clientdistro=el9.3 env=ONLY=48a,ONLY_REPEAT=10
Test-Parameters: testlist=sanity-sec clientdistro=el9.3
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6e121920c7e86e4d0004def77b0ce066ae2ba81a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54734
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-10003 lnet: migrate fail nid to Netlink 51/55051/5
James Simmons [Thu, 23 May 2024 21:25:30 +0000 (17:25 -0400)]
LU-10003 lnet: migrate fail nid to Netlink

We have the ability to make peers fail when they reach a specific
threshold using an ioctl that currently only uses small NIDs.
Move to Netlink to be able to use large NIDs. Also the Netlink
code is written to support more than one peer at a time even if
the original user land tool only supports setting one peer at a
time.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I8e5b38fcb582624530d208fac731183488662138
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17450 test: disable test 56x 56xa 56xb in sanity 62/54962/3
Hongchao Zhang [Mon, 15 Apr 2024 19:48:38 +0000 (03:48 +0800)]
LU-17450 test: disable test 56x 56xa 56xb in sanity

Add the interop tests 56x, 56xa, 56xb into always_except before
the patch https://review.whamcloud.com/53997 in LU-17525 is landed.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I99fa7be9dc7f50113d463aea4b321502b31d7348
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-16025 llite: allow unaligned DIO reaching EOF 18/54718/10
Bobi Jam [Wed, 10 Apr 2024 09:19:53 +0000 (17:19 +0800)]
LU-16025 llite: allow unaligned DIO reaching EOF

Direct IO requires the file offset and iov_iter count to be page aligned
if the server does not support unaligned DIO.

Old servers do not have OBD_CONNECT2_UNALIGNED_DIO support,
and are deemed as not supporting unaligned DIO.

Mirror resync uses direct IO to read data from a mirror. If the
file size is not page aligned, the last read iov_iter is
truncated by commit 4468f6c9d9 and contains an unaligned
iov_iter count, so it fails with old servers.

This patch fixes this interop issue by allowing unaligned DIO
reaching the end of the file.
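
The resulting alignment rule can be sketched as below. This is not the
llite code: the function name and the 4096-byte page size are
assumptions, and "reaching EOF" is taken here to mean that the I/O ends
exactly at the file size.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption for this example */

/* Decide whether a direct IO request may proceed. */
static bool dio_allowed(bool server_unaligned_dio, uint64_t offset,
			uint64_t count, uint64_t file_size)
{
	/* Servers with unaligned DIO support accept any offset/count. */
	if (server_unaligned_dio)
		return true;
	/* Otherwise the start offset must be page aligned. */
	if (offset % SKETCH_PAGE_SIZE)
		return false;
	if (count % SKETCH_PAGE_SIZE == 0)
		return true;
	/* An unaligned count is tolerated only when the I/O ends at EOF. */
	return offset + count == file_size;
}
```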

Test-Parameters: testlist=sanity-sec serverversion=EXA6
Fixes: 7194eb6431 ("LU-13805 clio: bounce buffer for unaligned DIO")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I229e193c3f0df0c21284991809573e312d18a556
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54718
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17676 build: configure should prefer to ask if 71/54571/2
Shaun Tancheff [Tue, 26 Mar 2024 08:04:24 +0000 (15:04 +0700)]
LU-17676 build: configure should prefer to ask if

In general configure messages should ask 'if <something>'
as configure is asking the question and trying to automatically
determine the answer.

In most cases prefer 'if <something>'.

This updates configure messages to ask if ...

Test-Parameters: trivial
HPE-bug-id: LUS-12117
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I11a42583faf2f88194c93a9aeea3b64f0d95f0eb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54571
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17836 build: allow builds without libpthread 62/55062/2
Shaun Tancheff [Thu, 9 May 2024 10:10:54 +0000 (17:10 +0700)]
LU-17836 build: allow builds without libpthread

Configure currently allows for --disable-libpthread. It is not
frequently used but may be needed by some users.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I32049bab8e0f278b4c80fe37839c8c90c45d4c74
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55062
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-16327 build: Ubuntu jammy 5.19 server support 25/50225/18
Shaun Tancheff [Sat, 2 Mar 2024 13:08:15 +0000 (20:08 +0700)]
LU-16327 build: Ubuntu jammy 5.19 server support

The Ubuntu 5.19 server ldiskfs series is close to that of the
mainline LTS 6.1.38 kernel.

Updated for Jammy 5.19.0-46-generic kernel
   ext4-mballoc-extra-checks.patch
   ext4-prealloc.patch

Tested with Ubuntu-hwe-5.19-5.19.0-46.47_22.04.1
Ubuntu Jammy 5.19.0-46-generic kernel

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iff2f3b29a7cf4778abb69505143ca2ea32022edf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50225
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 months agoLU-15496 tests: fix sanity/398c to use proper OSC name 32/55132/4
Andreas Dilger [Thu, 16 May 2024 19:57:42 +0000 (21:57 +0200)]
LU-15496 tests: fix sanity/398c to use proper OSC name

For ppc64le and aarch64 clients, the OSC import instance name does
not have "ffff" at the start, so use the proper device name for this
subtest.

Clean up the rest of test_398c to meet modern test code style.

Test-Parameters: trivial testlist=sanity env=ONLY=398c clientarch=ppc64le clientdistro=el8.8
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8c72fa9b13eace009f39daf82454221eba6761b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55132
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17844 target: remove some LCONSOLE_ERROR_MSG() 13/55113/2
Timothy Day [Wed, 15 May 2024 03:41:59 +0000 (03:41 +0000)]
LU-17844 target: remove some LCONSOLE_ERROR_MSG()

Replace LCONSOLE_ERROR_MSG() with LCONSOLE_ERROR().

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I8de0221e29c8ec70759eea38a67001f283f6fe39
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55113
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17848 osd: deduplicate osd_fid_init()/fini()/alloc() 10/55110/6
Timothy Day [Tue, 14 May 2024 20:59:16 +0000 (20:59 +0000)]
LU-17848 osd: deduplicate osd_fid_init()/fini()/alloc()

These functions are identical in the two OSD implementations. They
can be moved to lustre/fid/ and made generic.

These functions are forced to live in fid.ko rather than obdclass.ko
due to module dependency issues.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If97ca5615d9bdfe0fe9886686e9ce3ec2b740f7d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55110
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17850 build: prefer LINUXRELEASE over uname -r 08/55108/2
Jian Yu [Tue, 14 May 2024 17:50:20 +0000 (10:50 -0700)]
LU-17850 build: prefer LINUXRELEASE over uname -r

In a container or chroot environment, "uname -r" reports the host
kernel version instead of the target kernel version. We should use
the LINUXRELEASE variable, which is configured in
config/lustre-build-linux.m4 with the value from UTS_RELEASE.

Change-Id: Iaa48027f5ae873e1298695a264db1c351d9eac5c
Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55108
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: ake sandgren <ake.sandgren@hpc2n.umu.se>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
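
Illustration for LU-17850 above (not taken from the patch): a minimal
shell sketch of the idea of preferring a configured kernel release over
"uname -r". The helper name kernel_release and the fallback to
"uname -r" when LINUXRELEASE is unset are assumptions made for this
example; the actual change may handle the unset case differently.

```shell
#!/bin/sh
# Hypothetical helper (not from the Lustre tree): print the kernel
# release that a build should target.
kernel_release() {
    if [ -n "${LINUXRELEASE:-}" ]; then
        # Value set at configure time; reflects the target kernel.
        printf '%s\n' "$LINUXRELEASE"
    else
        # Assumed fallback: reflects the running (host) kernel, which
        # can be wrong inside a container or chroot.
        uname -r
    fi
}
```

With LINUXRELEASE set in the environment, kernel_release prints that
value unchanged; otherwise it prints whatever "uname -r" reports.
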
5 months ago LU-17848 osd: purge key_rec() from dt API 99/55099/2
Timothy Day [Tue, 14 May 2024 02:19:28 +0000 (02:19 +0000)]
LU-17848 osd: purge key_rec() from dt API

This is a pointless function pointer field that has
spawned a number of pointless function implementations.
Even the documentation has no idea why this exists.

Remove everything to do with key_rec().

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I7f84853a3fa285bf2ac53661b30384d099be1b91
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55099
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17844 lnet: remove a few LCONSOLE_ERROR_MSG() in api-ni.c 98/55098/2
Timothy Day [Tue, 14 May 2024 01:23:34 +0000 (01:23 +0000)]
LU-17844 lnet: remove a few LCONSOLE_ERROR_MSG() in api-ni.c

These magic numbers aren't so magical anymore. Just
use LCONSOLE_ERROR().

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I629ef2ceaa51dc1422d87dc056de2c46079438c0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55098
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>