Whamcloud - gitweb
fs/lustre-release.git
3 months ago LU-16424 tests: Add version check in sanity-lnet b2_15-next
Wei Liu [Fri, 22 Nov 2024 20:29:53 +0000 (12:29 -0800)]
LU-16424 tests: Add version check in sanity-lnet

Skip sanity-lnet test_205, test_207 and test_209 if the
version is older than 2.14.58, since the lnet_if_list
function was added in commit
3166a201e0 ("LU-15398 tests: Use remote peers for health tests")

Lustre-change: https://review.whamcloud.com/51942
Lustre-commit: ee4f470d590dd19d9c7d188958d9305ccd666e5e

Test-Parameters: trivial testlist=sanity-lnet \
serverjob=lustre-b2_14 serverbuildno=2 \
serverdistro=el8.3

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: I9cd62d91980784e3b33cf4e30426bf74d17f717f

3 months ago LU-15846 tests: don't use comma-separated debug flags
Andreas Dilger [Thu, 21 Nov 2024 20:15:54 +0000 (12:15 -0800)]
LU-15846 tests: don't use comma-separated debug flags

To avoid test interop issues between 2.15 clients and 2.12/2.14
servers, don't use comma-separated debug flags in sanity-quota.sh
quota_init() and quota_fini().

Lustre-change: https://review.whamcloud.com/47308
Lustre-commit: fe8315c25ed093d77a6f366e2a4849aba008b680

Test-Parameters: trivial testlist=sanity-quota env=ONLY=48 serverversion=2.14.0
Fixes: 6b6fde1026 ("LU-13055 libcfs: allow comma-separated masks")
Fixes: 78be823f33 ("LU-15218 quota: delete unused quota ID")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifca39054d14292bca8bcff9b8e03ae58fd5cc3a8

3 months ago LU-18356 tests: allow server to specify except list
Andreas Dilger [Wed, 6 Nov 2024 04:00:06 +0000 (21:00 -0700)]
LU-18356 tests: allow server to specify except list

Allow the installed server code to specify a list of subtests that
should be excluded by older clients when running a particular test
script.  This allows older clients to skip tests that they would
otherwise run from their local test script, but that do not work due
to server changes.

The files for each test script are read from the mds1 and ost1 facets.
The filename(s) under lustre/tests/except/ should start with the base
test script name (e.g. sanity), followed by '.', an optional unique
string to avoid conflicts between patches, and end with ".ex".
For example, sanity.ex, sanity.test_142.ex, sanity.acl.ex are valid
"sanity.sh" except filenames, but sanity-acl.ex is not.
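The naming rule above can be sketched as a small filename check. This is
only an illustration in Python, not the test-framework.sh code; the
function name is_except_file is hypothetical, and it assumes the optional
unique string contains no further '.' characters.

```python
import re


def is_except_file(filename: str, script: str) -> bool:
    """Return True if filename is a valid except file for the test script."""
    # "sanity.sh" and "sanity" both refer to the base test script name.
    base = script[:-3] if script.endswith(".sh") else script
    # base name, then '.', then an optional "<unique>." part, then "ex"
    pattern = re.escape(base) + r"\.(?:[^./]+\.)?ex"
    return re.fullmatch(pattern, filename) is not None


for name in ("sanity.ex", "sanity.test_142.ex", "sanity.acl.ex", "sanity-acl.ex"):
    print(name, is_except_file(name, "sanity.sh"))
```

Requiring the '.' right after the base name is what keeps sanity-acl.ex
from being picked up for sanity.sh.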

Lines starting with '#' are comments and ignored.  Otherwise, lines
should have whitespace-separated fields on each line, as shown in the
examples below.

  #facet op need_version             jira     space_separated_subtests
  mds1    < v2_14_55-100-g8a84c7f9c7 LU-14927 0f
  linux   < 5.12.0                   LU-18102 27J
  client  == OST1_VERSION            LU-13081 151 156

The facet may be "client", "mds1", "ost1", or "linux" (client), and
"need_version" can be any Lustre (or Linux) version number or another
version name like OST1_VERSION, MDS1_VERSION, or CLIENT_VERSION.
The "op" can be standard math/logic comparisons ">=", "<", "!=", etc.

The version comparison is handled like the below pseudo-code:

  ${FACET}_VERSION $op $need_version OR except $subtests

In other words, the version check must be true or the subtest(s) will
not be run.  Checks within a single file should be ordered by subtest
number to make it easier to see whether some subtest is being skipped.
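A minimal sketch of how such lines could be evaluated, written in Python
for illustration only (the real logic lives in the bash test framework).
The helper names version_key and excluded_subtests, the LINUX_VERSION key
and the sample version numbers are assumptions, and the version parsing is
deliberately simplistic: it drops a trailing "-g<hash>" and compares the
remaining numbers as a tuple.

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">=": operator.ge, ">": operator.gt}


def version_key(version: str) -> tuple:
    """Turn '5.12.0' or 'v2_14_55-100-g8a84c7f9c7' into a tuple of ints."""
    version = version.split("-g")[0]  # drop the git hash suffix, if any
    return tuple(int(n) for n in re.findall(r"\d+", version))


def excluded_subtests(lines, versions):
    """Return {subtest: jira} for every rule whose version check is false."""
    excluded = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comments and blank lines are ignored
        facet, op, need, jira, *subtests = line.split()
        have = versions[f"{facet.upper()}_VERSION"]
        need = versions.get(need, need)  # allow names like OST1_VERSION
        if not OPS[op](version_key(have), version_key(need)):
            for subtest in subtests:
                excluded[subtest] = jira
    return excluded


RULES = """\
#facet op need_version             jira     space_separated_subtests
mds1    < v2_14_55-100-g8a84c7f9c7 LU-14927 0f
linux   < 5.12.0                   LU-18102 27J
client  == OST1_VERSION            LU-13081 151 156
"""

# Hypothetical versions for one test run.
VERSIONS = {"MDS1_VERSION": "2.14.50", "OST1_VERSION": "2.14.50",
            "CLIENT_VERSION": "2.15.6", "LINUX_VERSION": "5.14.0"}

print(excluded_subtests(RULES.splitlines(), VERSIONS))
```

With these sample versions the mds1 check is true, so 0f still runs, while
the linux and client checks are false and their subtests are excluded.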

Lustre-change: https://review.whamcloud.com/56901
Lustre-commit: c4c3a7350b55ace0c38123c4b820c713f42e1cb7

Fix tests using "sh TESTSCRIPT.sh" instead of "bash TESTSCRIPT.sh"
to start a script that is calling test-framework.sh, since that
would now run afoul of the bashism that is added in this patch.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0216d9980147ce3409807e9d7f9759fe533ebbe5

3 months ago LU-15553 test: mkdir_on_mdt0 in replay-vbr.sh
Lai Siyao [Fri, 8 Nov 2024 19:51:53 +0000 (11:51 -0800)]
LU-15553 test: mkdir_on_mdt0 in replay-vbr.sh

Change mkdir to mkdir_on_mdt0 in several replay-vbr.sh sub tests.

Lustre-change: https://review.whamcloud.com/56540
Lustre-commit: f6c733c422eae64cea93c33fb14e6adb2eed81d0

Fixes: b9c4dc3c33 ("LU-14792 llite: enable filesystem-wide default LMV")
Test-Parameters: trivial testlist=replay-vbr mdtcount=4
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7457c155bbadb86adf8272113a4e4202b98c20a5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
3 months ago LU-18214 ldlm: change flock deadlock detection
Yang Sheng [Thu, 14 Nov 2024 20:54:22 +0000 (12:54 -0800)]
LU-18214 ldlm: change flock deadlock detection

The flock deadlock detection code treated a request lock that is
the same as the blocking lock as a bug. In fact, this is a case
of a cyclic chain, so it should be treated as a deadlock.
Also clean up the reprocess code.
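
The idea can be sketched with a simplified wait-for chain. This is an
illustration in Python, not the ldlm code: the function is_deadlock and
the waiting_on mapping (lock owner to the owner it is blocked by) are
assumptions made for the example.

```python
def is_deadlock(requester, blocker, waiting_on):
    """Follow the chain of blocked owners starting at the blocker.

    waiting_on maps an owner to the owner whose lock it is blocked by.
    If the chain leads back to the requester, granting can never happen.
    """
    seen = set()
    current = blocker
    while current is not None and current not in seen:
        if current == requester:
            return True  # the chain cycles back to the request: deadlock
        seen.add(current)
        current = waiting_on.get(current)
    return False


# B already waits on A, so A waiting on B closes the cycle.
print(is_deadlock("A", "B", {"B": "A"}))
# B waits on C, which waits on nobody: no deadlock for A.
print(is_deadlock("A", "B", {"B": "C"}))
```

In this model, reaching the request itself while walking the chain is
reported as a deadlock instead of being asserted as impossible.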

Lustre-change: https://review.whamcloud.com/56319
Lustre-commit: TBD (from 1fd956b9467c06b8f861db1706fc13cae9bb6fc8)

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Icf0df4ac281c2cdb6cc57cb79db137d39ecef9e6

3 months ago LU-17070 lov: retry layout refresh if got old layouts
Bobi Jam [Tue, 12 Nov 2024 20:09:12 +0000 (12:09 -0800)]
LU-17070 lov: retry layout refresh if got old layouts

lov_layout_change() does not apply old layouts, which can get through
when the MDS doesn't take the layout lock. This patch retries fetching
the layout once and re-applies it.

Lustre-change: https://review.whamcloud.com/55061
Lustre-commit: 7974e41a26c22181be2818b3580756fa559d14d9

Fixes: 13557aa869 ("LU-15300 mdt: refresh LOVEA with LL granted")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id29ec4ada85060a20f730f92a6a9409d755a56a1

3 months ago LU-16770 llite: prune object without layout lock first
Andriy Skulysh [Tue, 12 Nov 2024 20:03:42 +0000 (12:03 -0800)]
LU-16770 llite: prune object without layout lock first

lov_layout_change() calls cl_object_prune() before
changing the layout. This may lead to eviction from the MDT
in case of a slow response from the OST.

To reduce the risk of eviction, call cl_object_prune()
without the layout lock held before calling lov_layout_change().

vvp_prune() attempts to sync and truncate page cache pages.
osc_page_delete() may encounter page cache pages in non-clean state
during truncate because there's a race window between sync and truncate.
Writes may stick into this window and generate dirty or writeback pages.

This window is usually protected with a special truncate semaphore e.g.
when truncate is requested from the truncate syscall.

Let's use this semaphore to avoid write vs truncate race in vvp_prune().

Lustre-change: https://review.whamcloud.com/50742
Lustre-commit: 9c453ba6d9a0152aa75e92b8372d54a758a10b18

HPE-bug-id: LUS-9927, LUS-11612
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: Ie2ee29ea1e792e1b34b6de068ff2b84fd8f52f2a

3 months ago LU-15300 mdt: refresh LOVEA with LL granted
Alex Zhuravlev [Tue, 12 Nov 2024 19:23:08 +0000 (11:23 -0800)]
LU-15300 mdt: refresh LOVEA with LL granted

This change tries to fix three problems:
1) mdt_reint_open() fetches LOVEA before layout lock is taken.
   this may race with another process changing the layout and
   may result in a stale layout returned with a granted layout
   lock - re-fetch LOVEA once layout lock is granted

2) lov_layout_change() should not apply old layouts which
   can get through when MDS doesn't take layout lock

3) LFSCK shouldn't ignore layout version stored on MDS to avoid
   a situation when version degrades compared to client's copy.

This patch misses an optimization and can result in a number of
useless calls to OSD to fetch LOVEA. To be fixed in a followup
patch.

Lustre-change: https://review.whamcloud.com/46413
Lustre-commit: 13557aa86904376e48a5e43256d5c1ab32c1c2d6

LU-14869 test: improve sanity-flr/200a

Make sure "flock -x" successfully returned before running mirror
resync so that it won't get into running read holding shared flock.

Lustre-change: https://review.whamcloud.com/54345
Lustre-commit: 2bf51212680b3d4117925965c368d53587bf37d4

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idee1101d152ab09947faf6d75574a8761a7690a5

3 months ago LU-18085 llite: use RCU to protect the dentry_data
Yang Sheng [Thu, 14 Nov 2024 02:25:17 +0000 (18:25 -0800)]
LU-18085 llite: use RCU to protect the dentry_data

Upstream has changed the dentry kill rules since
v6.7-rc1-20-g1c18edd1b7a0. The d_release callback is now
invoked before the dentry is removed from the children
list, so changes to d_fsdata can be seen by others.
call_rcu is already used to handle the release, so just
apply RCU on the read side to ensure safe access.

Lustre-change: https://review.whamcloud.com/55984
Lustre-commit: 983999bda71115595df48d614ca1aaf9b746c75f

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I58713bfbf22749d6c0a5e40f710549662248e32f

3 months ago LU-17613 tests: explicit check for eviction with dmesg parse
Vladimir Saveliev [Tue, 30 Jul 2024 15:21:43 +0000 (18:21 +0300)]
LU-17613 tests: explicit check for eviction with dmesg parse

client_evicted() used to check for client eviction based on result of
lfs df. When it returned any error but EOPNOTSUPP - that was taken as
"client was evicted".

When glibc's realpath() changed to not call stat()
(see for ref
  stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ
  ..
  - Realpath mishandles EOVERFLOW; stat not needed anyway (BZ#24970).
)
'lfs df' started to return EOPNOTSUPP from lfs_df(). client_evicted()
was then changed so that any non-zero return is taken as "client was
evicted".

Check for "This client was evicted" in dmesg output to make sure that
eviction happened.

Add a comment in ptlrpc_import_recovery_state_machine() to make it
clear that this specific error message is used by the test code. Avoid
ratelimiting for the message.
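
A rough sketch of the explicit check, in Python for illustration only
(the real client_evicted() is a shell function in the test framework).
The sample log lines are made up; only the "This client was evicted"
marker comes from the description above.

```python
EVICTION_MARKER = "This client was evicted"


def client_evicted(dmesg_output: str) -> bool:
    """Return True only if the eviction message appears in the dmesg text."""
    return any(EVICTION_MARKER in line for line in dmesg_output.splitlines())


# Hypothetical dmesg excerpts.
evicted_log = (
    "LustreError: 167-0: example-MDT0000-mdc: This client was evicted "
    "by example-MDT0000; in progress operations will fail.\n"
)
other_error_log = "LustreError: some unrelated failure: rc = -95\n"

print(client_evicted(evicted_log))
print(client_evicted(other_error_log))
```

Matching on the message avoids treating an unrelated 'lfs df' failure as
an eviction.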

Lustre-change: https://review.whamcloud.com/54299
Lustre-commit: ab5a2b63fb90b75ef07d25b347423e2db05286ef

Fixes: a5a9ded43b ("LU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP")
Test-Parameters: trivial testlist=replay-vbr,recovery-small
HPE-bug-id: LUS-11742
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I10ef99d23d630164bfdf167e54e2f177e9b85598

3 months ago LU-18387 kernel: update RHEL 9.5 [5.14.0-503.14.1.el9_5]
Jian Yu [Mon, 18 Nov 2024 20:18:19 +0000 (12:18 -0800)]
LU-18387 kernel: update RHEL 9.5 [5.14.0-503.14.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.14.1.el9_5 for Lustre client.

Lustre-change: https://review.whamcloud.com/57029
Lustre-commit: TBD (from 1879bf37e4360b46feddf9a01b531b8226b6befa)

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.5 testlist=sanity
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-3

Change-Id: I47b80f5fec166220ac25563460a3b0f4fbd2e6bb
Signed-off-by: Jian Yu <yujian@whamcloud.com>
3 months ago LU-xxxx - disable DOM in racer
Oleg Drokin [Tue, 23 Oct 2018 05:51:18 +0000 (01:51 -0400)]
LU-xxxx - disable DOM in racer

this is another source of timeouts?

3 months ago Restore memory pressure
Oleg Drokin [Wed, 17 Oct 2012 02:10:24 +0000 (22:10 -0400)]
Restore memory pressure

5 months ago New release 2.15.6 2.15.6 v2_15_6
Andreas Dilger [Wed, 27 Nov 2024 21:38:43 +0000 (14:38 -0700)]
New release 2.15.6

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: Ibfd72654a39441de5f7d9aadeae05eef3f500c1e

5 months ago New RC 2.15.6-RC1 2.15.6-RC1 v2_15_6-RC1
Oleg Drokin [Mon, 18 Nov 2024 17:46:15 +0000 (12:46 -0500)]
New RC 2.15.6-RC1

Change-Id: Ib3857268cee9d89bd1fa2212e6ef53d45cf55513
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-18435 lod: recover layout generation from replay 89/56989/4
Alex Zhuravlev [Tue, 12 Nov 2024 20:23:03 +0000 (12:23 -0800)]
LU-18435 lod: recover layout generation from replay

The offset of the layout generation is different between struct
lov_mds_md_v1/v3.lmm_layout_gen and lov_comp_md.lcm_layout_gen.
When checking/setting layout gen, we must use the layout-specific field.

Otherwise layout generation can be set to 0 (or other random value)
after replay and client can't apply new layout during later update.
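
To illustrate why the field must be chosen per layout type, here is a
small Python sketch that reads the generation from a raw layout buffer.
It is not the lod code. The magic values and byte offsets below are
assumptions based on my reading of the on-disk structures (a 16-bit
lmm_layout_gen at offset 30 for v1/v3, a 32-bit lcm_layout_gen at offset 8
for composite layouts, little-endian) and should be checked against the
Lustre headers before relying on them.

```python
import struct

# Assumed magic values; verify against the Lustre headers.
LOV_MAGIC_V1 = 0x0BD10BD0
LOV_MAGIC_V3 = 0x0BD30BD0
LOV_MAGIC_COMP_V1 = 0x0BD60BD0


def layout_gen(lmm: bytes) -> int:
    """Read the layout generation from the field that matches the magic."""
    (magic,) = struct.unpack_from("<I", lmm, 0)
    if magic in (LOV_MAGIC_V1, LOV_MAGIC_V3):
        # lmm_layout_gen: 16-bit field after magic, pattern, object id,
        # stripe size and stripe count (assumed offset 30)
        (gen,) = struct.unpack_from("<H", lmm, 30)
    elif magic == LOV_MAGIC_COMP_V1:
        # lcm_layout_gen: 32-bit field after magic and size (assumed offset 8)
        (gen,) = struct.unpack_from("<I", lmm, 8)
    else:
        raise ValueError(f"unknown layout magic {magic:#x}")
    return gen


# Hand-built sample buffers with generation 7 (plain) and 9 (composite).
plain = struct.pack("<II16sIHH", LOV_MAGIC_V1, 0, b"\0" * 16, 1 << 20, 1, 7)
comp = struct.pack("<III", LOV_MAGIC_COMP_V1, 32, 9)
print(layout_gen(plain), layout_gen(comp))
```

Reading a composite layout at the v1/v3 offset (or the reverse) would
pick up unrelated bytes, which matches the "0 or other random value"
symptom described above.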

Lustre-change: https://review.whamcloud.com/56950
Lustre-commit: 1d8a667073b9ef59b6c430642805efec91546ecf
Fixes: 13557aa86904 ("LU-15300 mdt: refresh LOVEA with LL granted")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5e4a63cd097d157317e0e8d1a0fca4a46817d118
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56989
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months ago LU-18387 kernel: update RHEL 9.5 [5.14.0-503.11.1.el9_5] 98/56998/2
Jian Yu [Wed, 13 Nov 2024 06:14:13 +0000 (22:14 -0800)]
LU-18387 kernel: update RHEL 9.5 [5.14.0-503.11.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.11.1.el9_5 for Lustre client.

Lustre-change: https://review.whamcloud.com/56997
Lustre-commit: TBD (from 929971901f8ca3f90fe593005002865327b137dd)

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.5 testlist=sanity
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-3

Change-Id: I9bc6924c4a71f743acd9df99042df23fdf614593
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17974 quota: fix qmt_pool_lqes_lookup_spec 36/55536/3
Sergey Cheremencev [Wed, 13 Nov 2024 08:21:19 +0000 (00:21 -0800)]
LU-17974 quota: fix qmt_pool_lqes_lookup_spec

Return 0 from qmt_pool_lqes_lookup_spec if there is a
global lqe among the found lqes. Return -ENOENT if
* no lqes have been found
* there is no global lqe among the found lqes
This patch aims to prevent the panic below:

 (qmt_lock.c:957:qmt_id_lock_notify())
ASSERTION( lqe->lqe_is_global ) failed:
 (qmt_lock.c:957:qmt_id_lock_notify()) LBUG
 ...
 Call Trace TBD:
 libcfs_call_trace+0x6f/0xa0 [libcfs]
 lbug_with_loc+0x3f/0x70 [libcfs]
 qmt_id_lock_notify+0x1ee/0x330 [lquota]
 qmt_site_recalc_cb+0x34b/0x550 [lquota]
 cfs_hash_for_each_tight+0x122/0x310 [libcfs]
 qmt_pool_recalc+0x375/0xa80 [lquota]
 kthread+0x134/0x150
 ret_from_fork+0x35/0x40
 Kernel panic - not syncing: LBUG
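
The return rule described above can be modelled in a few lines. This is
a Python illustration only, not the lquota code: the Lqe class and the
function lqes_lookup_result are hypothetical stand-ins.

```python
import errno
from dataclasses import dataclass


@dataclass
class Lqe:
    """Hypothetical stand-in for a quota entry found by the lookup."""
    pool: str
    is_global: bool


def lqes_lookup_result(found):
    """Return 0 only if a global lqe is among the found lqes."""
    if any(lqe.is_global for lqe in found):
        return 0
    # Covers both "nothing found" and "no global lqe among those found".
    return -errno.ENOENT


print(lqes_lookup_result([Lqe("pool1", False), Lqe("global", True)]))
print(lqes_lookup_result([Lqe("pool1", False)]))
print(lqes_lookup_result([]))
```

Callers that see 0 can then rely on a global lqe being present, which is
the condition the assertion in qmt_id_lock_notify() checks.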

Lustre-change: https://review.whamcloud.com/55535
Lustre-commit: c97b327758f06f6bf3229126e9aa7b36865e7b92

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I62a2175b7b05c49f28b4e87c36ed653d1b9a71cc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-16639 misc: cleanup console messages 27/56727/4
Andreas Dilger [Wed, 13 Nov 2024 02:16:20 +0000 (18:16 -0800)]
LU-16639 misc: cleanup console messages

The lprocfs_job_cleanup() was not properly dropping all jobstats
from the hash table and printing errors from job_stat_exit() at
unmount.  Ensure all stats are "old enough" when @clear is set.

Change early libcfs cfs_cpu_init() messages from CERROR() to
pr_err() to avoid circular dependencies on libcfs setup before
printing an error message to the console during module init.

Lustre-commit: 8f40a3d7110da1af8e310a4b7f40b86f13080938
Lustre-change: https://review.whamcloud.com/50283

Test-Parameters: trivial
Fixes: ea2cd3af7b ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ide3f502103392a79419cc1836200bf5a1a3ebbe5
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Signed-off-by: Eric Carbonneau <carbonneau1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56727
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-13308 mdc: support additional flags for OBD_IOC_CHLG_POLL ioctl 82/54982/3
James Simmons [Thu, 2 May 2024 01:31:53 +0000 (21:31 -0400)]
LU-13308 mdc: support additional flags for OBD_IOC_CHLG_POLL ioctl

Currently the mdc kernel code expects the flag argument for
OBD_IOC_CHLG_POLL ioctl to only be CHANGELOG_FLAG_FOLLOW. With
IPv6 we need to send a request to the kernel to present the NID
in the struct lnet_nid format since we can't just send large NIDs
to user land if we are using older tools.

With the newer user land tools we will be sending an expanded flag
which the current kernel changelog code can't handle. Rework the
code to support the new flag if we end up with the case of newer
user land tools and an older kernel. This code will also maintain
backwards compatibility with the older user land tools.

Lustre-change: https://review.whamcloud.com/52361
Lustre-commit: 8320394725180b76e76f36b8a513f3c7bf11e65c

Change-Id: I26a80d30ce2ebf2075a2a8f510ff81d6b0b8d848
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52361
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54982
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months ago LU-17152 tests: unmount NFS clients with zconf_umount_clients 86/56886/2
Jian Yu [Tue, 5 Nov 2024 04:35:27 +0000 (20:35 -0800)]
LU-17152 tests: unmount NFS clients with zconf_umount_clients

This patch fixes cleanup_nfs() to unmount NFS clients by running
zconf_umount_clients(), which can find and kill active processes
that are accessing the NFS mount point so as to avoid the
"device is busy" failure.

The patch also adds racer_on_nfs test into always_except list for
parallel-scale-nfsv4 due to LU-17154.

Lustre-change: https://review.whamcloud.com/52533
Lustre-commit: 563deecae0ac2690b6d8d5571bf7af09408943cd

Test-Parameters: trivial testlist=parallel-scale-nfsv4

Change-Id: I37a38502362399540c28e78d1343e768b490ce8b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56886
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-12706 tests: sanity-quota 4a sync timeout fix 10/56910/2
Sergey Cheremencev [Wed, 6 Nov 2024 20:56:58 +0000 (12:56 -0800)]
LU-12706 tests: sanity-quota 4a sync timeout fix

Don't sync all OSTs in a system - this might take
too much time. Instead, set striping only on OST0000
and sync only MDTs and OST0000. This fix is against
the following failure:

  FAIL: Passed grace time 20, 15669105271566910563

Lustre-change: https://review.whamcloud.com/55216
Lustre-commit: 9e7b239bbd26b601127073bb0c6789cb9def7073

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I525e6c73c6d14a126a2bde7d92bc28f11f3c78c8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56910
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-10733 tests: increase conf-sanity/106 OST size 18/56918/2
Andreas Dilger [Thu, 7 Nov 2024 20:38:57 +0000 (12:38 -0800)]
LU-10733 tests: increase conf-sanity/106 OST size

conf-sanity test_106 is trying to create ~64k files, but OST0000
only has about 48k objects in this case, so the file creates are
failing during the test.  This makes the test somewhat unreliable,
hitting errors unrelated to what was originally intended
(llog wrap handling).

Increase the OSTSIZE for this test to handle the number of objects
needed by the test so it can run more reliably.

Lustre-change: https://review.whamcloud.com/50732
Lustre-commit: 334d780617561c66c91697fb1681ce24b5379387

Test-Parameters: trivial
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106

Test-Parameters: optional env=SLOW=yes,ENABLE_QUOTA=yes \
  clientdistro=el8.9 serverdistro=el8.10 testlist=conf-sanity

Test-Parameters: optional env=SLOW=yes,ENABLE_QUOTA=yes \
  clientdistro=ubuntu2204 serverdistro=el8.9 testlist=conf-sanity

Test-Parameters: optional env=SLOW=yes,ENABLE_QUOTA=yes \
  clientdistro=sles15sp5 serverdistro=el8.9 testlist=conf-sanity

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie33825801172ea565d9d1d5fb81595d2cad65677
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56918
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months ago LU-18407 tests: check Lustre-patched filefrag 99/56899/2
Jian Yu [Wed, 6 Nov 2024 00:17:31 +0000 (16:17 -0800)]
LU-18407 tests: check Lustre-patched filefrag

In Lustre test suites, there are some subtests using filefrag
from Lustre-patched e2fsprogs. This patch adds checks in those
subtests to skip them if the Lustre-patched e2fsprogs is not
installed on Lustre client.

Test-Parameters: trivial
Test-Parameters: env=ONLY="228" clientdistro=ubuntu2204 testlist=sanity-hsm
Test-Parameters: env=ONLY="24a" clientdistro=ubuntu2204 testlist=sanity-pfl
Test-Parameters: env=ONLY="56" clientdistro=ubuntu2204 testlist=sanity-sec
Test-Parameters: env=ONLY="77n 130" clientdistro=ubuntu2204 testlist=sanity

Change-Id: I86e2edd18052ff7fb19e7cbcbb660aa383824372
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56899
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-7665 test: improve sanity 300p 41/56941/3
Lai Siyao [Fri, 8 Nov 2024 19:14:26 +0000 (11:14 -0800)]
LU-7665 test: improve sanity 300p

Sanity test 300p sets OBD_FAIL_OUT_ENOSPC once, but that may fail an llog
operation (not critical) instead, so the subsequent mkdir succeeds. Change
the fail_loc to always fail so the test is more robust.

Lustre-change: https://review.whamcloud.com/54625
Lustre-commit: ac04484c1beec9f46d1256e8ea236f24073344af

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I128ce39aaf97e1785a8c135a696d0b404b48a2a8
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56941
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-10994 test: remove netdisk from obdfilter-survey 13/49413/6
John L. Hammond [Mon, 11 Nov 2024 23:16:19 +0000 (15:16 -0800)]
LU-10994 test: remove netdisk from obdfilter-survey

Remove the netdisk case from obdfilter-survey. Remove subtests that
use echo_client over osc devices.

Lustre-change: https://review.whamcloud.com/47239
Lustre-commit: 51c491dac6aec99fc328732b4358e8d5732dc230

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I260001241cee3027f68e62077e5817221bd0c08b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49413
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17985 osd-ldiskfs: drop osd object if failed to create 40/56940/2
Hongchao Zhang [Fri, 8 Nov 2024 19:07:35 +0000 (11:07 -0800)]
LU-17985 osd-ldiskfs: drop osd object if failed to create

In osd_create(), if the newly created inode already contains a
correct XATTR_NAME_LMA but the OI update fails, osd_object->oo_inode
is cleared. The osd_object should also be dropped in that case.

Lustre-change: https://review.whamcloud.com/55571
Lustre-commit: 40e27b4251bec6d60ce0a6310a5ac7094980f9a3

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I4ff5952c154ce459c78514b88b1810471635c703
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56940
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-15496 tests: fix sanity/398c to use proper OSC name 77/56977/2
Andreas Dilger [Mon, 11 Nov 2024 19:16:00 +0000 (11:16 -0800)]
LU-15496 tests: fix sanity/398c to use proper OSC name

For ppc64le and aarch64 clients, the OSC import instance name does
not have "ffff" at the start, so use the proper device name for this
subtest.

Clean up the rest of test_398c to meet modern test code style.

Lustre-change: https://review.whamcloud.com/55132
Lustre-commit: b1b57bcadeeb5a87ac75387c4aa4ae084e1a27e0

LU-15496 tests: add debugging to sanity/398c

Dump the rpc_stats to help understand why the test is failing.

Lustre-change: https://review.whamcloud.com/53462
Lustre-commit: 304ca31e2aa15c576e468a86e45d8817c8eca391

Test-Parameters: trivial testlist=sanity clientarch=ppc64le env=ONLY=398c,ONLY_REPEAT=100

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8c72fa9b13eace009f39daf82454221eba6761b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-14472 quota: skip non-exist or inact tgt for lfs_quota 42/56942/2
Hongchao Zhang [Fri, 8 Nov 2024 19:25:58 +0000 (11:25 -0800)]
LU-14472 quota: skip non-exist or inact tgt for lfs_quota

The nonexistent or inactive targets (MDC or OSC) should be skipped
for "lfs quota".

Lustre-change: https://review.whamcloud.com/41771
Lustre-commit: b54b7ce43929ce7ff6e48cd219623c264ca6b6b3

Change-Id: I25eece413715e4e05dd94ccbfd101220da7477f9
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56942
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-15839 tests: correct the ZFS grace time for sanity-quota 4a 27/56927/3
Etienne AUJAMES [Fri, 8 Nov 2024 19:03:02 +0000 (11:03 -0800)]
LU-15839 tests: correct the ZFS grace time for sanity-quota 4a

For sanity-quota 4a, the grace time was increased from 12s to 20s but
not actually set on the filesystem.

Lustre-change: https://review.whamcloud.com/47289
Lustre-commit: 8f306f00c02e5455cef48d227f28e8cb90127719

Fixes: 3e4c3fdc ("LU-6836 test: re-add test 4a to sanity-quota for ZFS")
Test-Parameters: fstype=zfs testlist=sanity-quota env=ONLY=4a,ONLY_REPEAT=100
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I2324e818a42a19bc9928f127b1622f1e5274db1f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56927
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-15055 lod: run qmt_pool_* only from the MDT0000 config 39/56939/2
Etienne AUJAMES [Fri, 8 Nov 2024 18:51:38 +0000 (10:51 -0800)]
LU-15055 lod: run qmt_pool_* only from the MDT0000 config

On the first MDS (with MDT0000/QMT0000), if there is more than one MDT
target, the qmt_pool_{new/del/rem/add} functions will be called several
times on QMT0000 for the same pool.

This results in the following error in dmesg:
LustreError: 5659:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't
scratch-QMT0000 scratch-OST0000_UUID pool pool1: rc = -17

This patch runs qmt_pool_* only for config records from MDT0000.
The qmt_pool_add_rem() dmesg error is checked in sanity-quota test_1b.

Lustre-change: https://review.whamcloud.com/47059
Lustre-commit: 0f158c6a093e059d89f637f31d34742078c38209

Test-Parameters: mdtcount=2 mdscount=1 testlist=sanity-quota
Fixes: 09f9fb32 ("LU-11023 quota: quota pools for OSTs")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ia6b712abe25a4d68770753e3408c3321181db1aa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56939
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months ago LU-17750 kernel: update SLES15 SP4 [5.14.21-150400.24.100.2] 92/55192/8
Jian Yu [Tue, 5 Nov 2024 20:31:10 +0000 (12:31 -0800)]
LU-17750 kernel: update SLES15 SP4 [5.14.21-150400.24.100.2]

Update SLES15 SP4 kernel to 5.14.21-150400.24.100.2 for Lustre client.

Lustre-change: https://review.whamcloud.com/54823
Lustre-commit: 4cdabc2c25f71ed968d8c2300d3b717e3160d46e

Test-Parameters: trivial env=SANITY_EXCEPT="154b" \
  mdtcount=4 mdscount=2 clientdistro=sles15sp4 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp4 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp4 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp4 testgroup=full-part-3

Change-Id: I401e97f602e6c8c62fac73e3603eb0226745bba1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55192
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
5 months ago LU-18423 kernel: update RHEL 8.10 [4.18.0-553.27.1.el8_10] 95/56895/2
Jian Yu [Tue, 5 Nov 2024 20:24:09 +0000 (12:24 -0800)]
LU-18423 kernel: update RHEL 8.10 [4.18.0-553.27.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.27.1.el8_10.

Lustre-change: https://review.whamcloud.com/56888
Lustre-commit: TBD (from b084d5534a15741094a51ee40c9a1d5e9cfbf5e1)

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-3

Change-Id: I3737c1f1b2941d2095225f1ab80fd76768c4782c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-18414 kernel: update RHEL 9.4 [5.14.0-427.42.1.el9_4] 79/56879/3
Jian Yu [Tue, 5 Nov 2024 20:07:56 +0000 (12:07 -0800)]
LU-18414 kernel: update RHEL 9.4 [5.14.0-427.42.1.el9_4]

Update RHEL 9.4 kernel to 5.14.0-427.42.1.el9_4 for Lustre client.

Lustre-change: https://review.whamcloud.com/56845
Lustre-commit: TBD (from 72b19d2215d4b476faf5d5b0a955ce5c22873f86)

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3

Change-Id: Ib1b95bcaf35a9f8ed80fe7a33b51127086dd412c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56879
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17573 lov: change default object size. 00/56100/2
Alexey Lyashkov [Thu, 22 Feb 2024 06:38:03 +0000 (09:38 +0300)]
LU-17573 lov: change default object size.

OSTs have not been able to use indirect blocks for a long time,
so switch the object size to be extent-based.

Lustre-commit: f315a3a594a78ecd47fcd74177fa73fb2efff59c
Lustre-change: https://review.whamcloud.com/54137

Test-Parameters: trivial
HPe-bug-id: LUS-11428
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Signed-off-by: Eric Carbonneau <carbonneau1@llnl.gov>
Change-Id: I9759fc7122c41075ebc35d52ade342c37706b041
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56100
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17421 build: Update check for arc_prune_func_t parameters 19/54819/6
Brian Atkinson [Fri, 12 Jan 2024 00:36:59 +0000 (17:36 -0700)]
LU-17421 build: Update check for arc_prune_func_t parameters

In OpenZFS 2.2.1 the code for arc_prune_async() was unified so that
FreeBSD and Linux did not have their own implementation versions of
the same code. Part of this update changed the first parameter of
arc_prune_func_t to be a uint64_t.

Without this patch, Lustre would not build with ZFS 2.2.1 because of
an incompatible pointer types failure for the arc_prune_func_t
function pointer passed to arc_add_prune_callback().
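
The type change described above can be sketched in userspace as follows. This is an illustration only, not the Lustre or OpenZFS code: the MODEL_HAVE_ARC_PRUNE_UINT64 macro, the model_* names, and the int64_t fallback type are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro that a configure check might define when the
 * callback's first parameter is a uint64_t. */
#define MODEL_HAVE_ARC_PRUNE_UINT64 1

#ifdef MODEL_HAVE_ARC_PRUNE_UINT64
typedef uint64_t model_prune_arg_t;
#else
typedef int64_t model_prune_arg_t;	/* assumed older parameter type */
#endif

/* Stand-in for arc_prune_func_t: the callback must match this type
 * exactly, otherwise the compiler reports incompatible pointer types. */
typedef void model_arc_prune_func_t(model_prune_arg_t bytes, void *priv);

static uint64_t model_pruned_total;

/* Callback written against the selected parameter type. */
static void model_prune_cb(model_prune_arg_t bytes, void *priv)
{
	(void)priv;
	model_pruned_total += (uint64_t)bytes;
}

/* Stand-in for registering a callback: it simply invokes it once. */
static void model_add_and_fire(model_arc_prune_func_t *fn,
			       model_prune_arg_t bytes)
{
	fn(bytes, 0);
}
```

Because the callback and the typedef share model_prune_arg_t, a single definition compiles against either variant once the build system has picked the macro.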

Lustre-change: https://review.whamcloud.com/53664
Lustre-commit: 303cfe3372349974ff7cd610ad878b618ce4ee29

Test-Parameters: trivial
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Eric Carbonneau <carbonneau1@llnl.gov>
Change-Id: Iaa03cc9421f27a8517ce04817f04102de9adb86a
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Akash B <akash-b@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54819
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 months agoLU-16791 utils: ZFS 2.2 const prop args 18/54818/9
Brian Atkinson [Tue, 26 Sep 2023 18:35:43 +0000 (12:35 -0600)]
LU-16791 utils: ZFS 2.2 const prop args

ZFS 2.2 now expects const char * from certain interfaces in
sys/nvpair.h. I updated the build system to detect if this is the case
and if so update the parameters passed to certain functions in
libmount_utils_zfs.c to account for these changes.

Without this patch, Lustre master would not build with ZFS master and
the 2.2 release candidates.

Lustre-change: https://review.whamcloud.com/52519
Lustre-commit: b4b32ffd22d276bc1d8f40e3336df982f3717070

Test-Parameters: trivial testgroup=review-dne-zfs-part-1
Test-Parameters: testgroup=review-dne-zfs-part-2
Test-Parameters: testgroup=review-dne-zfs-part-3
Test-Parameters: testgroup=review-dne-zfs-part-4
Test-Parameters: testgroup=review-dne-zfs-part-5
Test-Parameters: testgroup=review-dne-zfs-part-6
Test-Parameters: testgroup=review-dne-zfs-part-7
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Eric Carbonneau <carbonneau1@llnl.gov>
Change-Id: I0469eeff6dafa6c276fc616381530b6b679d9da1
Reviewed-by: Akash B <akash-b@hpe.com>
Reviewed-by: Thomas Bertschinger <bertschinger@lanl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54818
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-15689 libcfs: libcfs_debug_mb set incorrectly on init 38/54538/3
Chris Horn [Wed, 23 Mar 2022 06:21:06 +0000 (01:21 -0500)]
LU-15689 libcfs: libcfs_debug_mb set incorrectly on init

If libcfs_debug_mb parameter is specified to insmod (i.e. set before
module is initialized) then it does not get initialized correctly.

libcfs_param_debug_mb_set() expects cfs_trace_get_debug_mb() to return
zero if the module has not been initialized yet, but
cfs_trace_get_debug_mb() will return 1 in this case. Modify
cfs_trace_get_debug_mb() to return zero as expected. A related issue
is that in this case we need to call cfs_trace_get_debug_mb() after
cfs_tracefile_init() so that libcfs_debug_mb gets the same value it
would get if we had set it after module init.

When libcfs_debug_mb is specified to insmod, libcfs_debug_init()
divides its value by num_possible_cpus(), but this is already done in
libcfs_param_debug_mb_set().
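
The double division described above can be illustrated with a simplified userspace model. This is a sketch only, not the libcfs code: model_num_cpus stands in for num_possible_cpus(), and every model_* name is hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for num_possible_cpus(). */
static unsigned int model_num_cpus = 4;

/* Models the parameter setter: converts a total size in MB into a
 * per-CPU size.  This is the only place the division should happen. */
static unsigned int model_debug_mb_set(unsigned int total_mb)
{
	return total_mb / model_num_cpus;
}

/* Models the old init path: divides a value that the setter has
 * already divided, so the per-CPU buffer ends up too small. */
static unsigned int model_init_double_divide(unsigned int total_mb)
{
	return model_debug_mb_set(total_mb) / model_num_cpus;
}

/* Models the fixed init path: relies on the setter's division only. */
static unsigned int model_init_fixed(unsigned int total_mb)
{
	return model_debug_mb_set(total_mb);
}
```

With 4 CPUs and a 64 MB request, the double-dividing path yields 4 MB per CPU while the fixed path yields 16 MB.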

Lustre-change: https://review.whamcloud.com/c/46925/
Lustre-commit: d38ef181d8250b083553ec95209c28c1dc11fa99

Test-Parameters: trivial
Fixes: 8b78a3ffb5 ("LU-9859 libcfs: always range-check libcfs_debug_mb setting.")
HPE-bug-id: LUS-10839
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I1003758156acb5cf6ea30bbdfd7b45a743a2a5aa
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54538
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-15618 lnet: Return ESHUTDOWN in lnet_parse() 59/49259/3
Chris Horn [Thu, 3 Mar 2022 07:12:32 +0000 (01:12 -0600)]
LU-15618 lnet: Return ESHUTDOWN in lnet_parse()

If the peer NI lookup in lnet_parse() fails with ESHUTDOWN then we
should return that value back to the LNDs so that they can treat the
failed call the same way as other lnet_parse() failures.

Returning zero results in at least one bug in socklnd where a
reference on a ksock_conn can be leaked which prevents socklnd from
shutting down.
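
A simplified model of why a zero return can leak a reference is sketched below. The reference handling in model_lnd_rx() is an assumption made for illustration and is not the socklnd code; all model_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

struct model_conn {
	int refs;	/* references held on the connection */
};

/* Models the peer NI lookup, which fails during shutdown. */
static int model_peer_lookup(int shutting_down)
{
	return shutting_down ? -ESHUTDOWN : 0;
}

/* Models the parse step.  With propagate == 0 the lookup failure is
 * swallowed and 0 is returned (old behaviour); with propagate != 0
 * the error is returned to the caller (new behaviour). */
static int model_parse(int shutting_down, int propagate)
{
	int rc = model_peer_lookup(shutting_down);

	if (rc == -ESHUTDOWN && !propagate)
		return 0;
	return rc;
}

/* Models an LND receive path (assumed): it takes a reference before
 * parsing and drops it itself only when parsing reports failure.  On
 * success the reference is assumed to be dropped later by message
 * completion, which never runs once shutdown has started. */
static int model_lnd_rx(struct model_conn *conn, int shutting_down,
			int propagate)
{
	int rc;

	conn->refs++;
	rc = model_parse(shutting_down, propagate);
	if (rc < 0)
		conn->refs--;
	return rc;
}
```

In this model, hiding the error leaves one reference outstanding during shutdown, while propagating -ESHUTDOWN lets the caller release it through its normal failure path.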

Lustre-change: https://review.whamcloud.com/46711
Lustre-commit: 4fbd0705a3d25bbc85e953f81e697e5006b215ce

Fixes: 47b7b31978 ("LU-8106 lnet: Do not drop message when shutting down LNet")
Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-15794
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Ic403619c6dccf3921c46a674808c404adad7a30e
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49259
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-12511 llite: use mapping_set_error instead of opencoded set_bit 53/55553/4
Michal Hocko [Thu, 11 Jul 2024 21:05:26 +0000 (17:05 -0400)]
LU-12511 llite: use mapping_set_error instead of opencoded set_bit

The mapping_set_error() helper sets the correct AS_ flag for the mapping
so there is no reason to open code it.  Use the helper directly.

[akpm@linux-foundation.org: be honest about conversion from -ENXIO to -EIO]
Link: http://lkml.kernel.org/r/20160912111608.2588-2-mhocko@kernel.org
Linux-commit: 5114a97a8bce7f4ead29a32b67dee85438699b9e

Lustre-change: https://review.whamcloud.com/51372
Lustre-commit: aac625055e50e83d7716bdfc6ecfab3282eb0ad2

Change-Id: I153bc04d4745a20013820ba81572cadb37ab8f39
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51372
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55553
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
5 months agoLU-17440 lnet: prevent erroneous decref for asym route 06/54906/4
Gian-Carlo DeFazio [Thu, 29 Feb 2024 00:44:48 +0000 (16:44 -0800)]
LU-17440 lnet: prevent erroneous decref for asym route

The following stack trace was seen on a lustre server:
Call Trace TBD:
[<0>] libcfs_call_trace+0x6f/0xa0 [libcfs]
[<0>] lbug_with_loc+0x3f/0x70 [libcfs]
[<0>] lnet_destroy_peer_ni_locked+0x44d/0x4e0 [lnet]
[<0>] lnet_handle_find_routed_path+0x86c/0xee0 [lnet]
[<0>] lnet_select_pathway+0xb95/0x16c0 [lnet]
[<0>] lnet_send+0x6d/0x1e0 [lnet]
[<0>] lnet_parse_local+0x3ed/0xdd0 [lnet]
[<0>] lnet_parse+0xd7d/0x1490 [lnet]
[<0>] kiblnd_handle_rx+0x30e/0x900 [ko2iblnd]
[<0>] kiblnd_scheduler+0x104b/0x10d0 [ko2iblnd]
[<0>] kthread+0x14c/0x170
[<0>] ret_from_fork+0x1f/0x40

It was discovered that the lnet routes between the server
and a client cluster were misconfigured, so that the clients
had routes to the server through all 8 available routers,
but the server had routes to the clients through only 7 of
the routers.

The server was contacted by a client node through the
router with the missing route. It incremented the ref count
for the corresponding struct lnet_peer_ni for that router,
but then, because it had no route through that peer, changed
the value of the struct lnet_peer_ni to a peer with a route
back to the client. It then decremented the new
struct lnet_peer_ni which resulted in the ref count being
decremented to 0 which caused an LBUG.

Detect if the peer is a router to the appropriate net.
If so, decrement its ref count at the end of the function,
if not, decrement its ref count immediately.
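
The refcount imbalance and the fix can be sketched with a small userspace model. This is not the LNet code: struct model_peer_ni, its fields, and both functions are hypothetical and only mirror the sequence described above.

```c
#include <assert.h>
#include <stddef.h>

struct model_peer_ni {
	int refcount;
	int routes_to_net;	/* non-zero if this peer routes to the target net */
};

/* Buggy pattern: the reference is taken on one peer_ni, the pointer
 * is switched, and the reference is then dropped on the other one. */
static void model_select_buggy(struct model_peer_ni *from,
			       struct model_peer_ni *alt)
{
	struct model_peer_ni *gw = from;

	gw->refcount++;
	if (!gw->routes_to_net)
		gw = alt;	/* pointer switched, reference not moved */
	gw->refcount--;		/* may hit a peer_ni never referenced here */
}

/* Fixed pattern: drop the reference immediately when the peer is not
 * a router to the target net, otherwise drop it at the end. */
static void model_select_fixed(struct model_peer_ni *from,
			       struct model_peer_ni *alt)
{
	struct model_peer_ni *held = from;

	held->refcount++;
	if (!from->routes_to_net) {
		held->refcount--;	/* not a router to this net: drop now */
		held = NULL;
	}

	(void)alt;	/* route selection through alt would happen here */

	if (held != NULL)
		held->refcount--;	/* router to this net: drop at the end */
}
```

Starting from a refcount of 1 on both entries, the buggy sequence leaves the first at 2 and the second at 0, which corresponds to the condition that triggered the LBUG; the fixed sequence leaves both at 1.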

Lustre-change: https://review.whamcloud.com/53896
Lustre-commit: 2b210f39059be998b80b0acc13c12451960b63bb

Fixes: 2e27193 ("LU-17062 lnet: Update lnet_peer_*_decref_locked usage")
Test-Parameters: testlist=sanity-lnet mdscount=1 osscount=2 clientcount=1
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: I2d00faef60ae8768afa7afbb1b00a62ba90535bb
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54906
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-18345 test: interop check for sanity-quota 2 88/56688/4
Hongchao Zhang [Mon, 30 Sep 2024 15:41:18 +0000 (23:41 +0800)]
LU-18345 test: interop check for sanity-quota 2

The "least qunit" string was renamed to "least_qunit" in 2.15.51,
so add interop handling for it.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-quota env=ONLY=2 serverjob=lustre-master serverbuildno=4586
Fixes: cd1847e73e ("LU-14535 quota: improve quota output format")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I1a2cbe66280c2165e0da78ca93605113f9d8e974
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56688
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-18344 test: sanity test_247f interop fix 33/56733/2
Lai Siyao [Sun, 29 Sep 2024 18:04:25 +0000 (14:04 -0400)]
LU-18344 test: sanity test_247f interop fix

2.16 always enables remote subdir mount, so update sanity test_247f.

Test-Parameters: trivial
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ibe04d307a5596a6047d5fd301e19c33bf07f1e21
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56733
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-16390 tests: check Lustre filefrag in sanity-flr/49a 05/56605/2
Andreas Dilger [Tue, 8 Oct 2024 00:18:13 +0000 (17:18 -0700)]
LU-16390 tests: check Lustre filefrag in sanity-flr/49a

Check that a Lustre-patched filefrag is installed when running
sanity-flr test_49a.

Lustre-change: https://review.whamcloud.com/49386
Lustre-commit: 37f18670e49b8150170f9b724b5f7089fa176c4e

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic909ea4ca160d47480004f53a96ce7539ce5076c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56605
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-18341 tests: skip sanity-flr/test_36 for new servers 93/56693/5
Bobi Jam [Tue, 15 Oct 2024 10:49:54 +0000 (18:49 +0800)]
LU-18341 tests: skip sanity-flr/test_36 for new servers

2.16 servers allow layout version updates from the client while 2.15
servers do not, so skip sanity-flr/test_36, which checks this
behavior.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-flr env=ONLY=36 serverjob=lustre-master serverbuildno=4586
Fixes: fa6574150b ("LU-14642 flr: allow layout version update from client/MDS")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I50d81922217b8a864053ba8781f4627f02410717
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56693
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 months agoLU-18387 kernel: new kernel [RHEL 9.5 5.14.0-503.2.1.el9_5] 54/56754/6
Shaun Tancheff [Wed, 30 Oct 2024 17:25:58 +0000 (10:25 -0700)]
LU-18387 kernel: new kernel [RHEL 9.5 5.14.0-503.2.1.el9_5]

This patch makes changes to support new RHEL 9.5 release
for Lustre client.

Lustre-change: https://review.whamcloud.com/56748
Lustre-commit: TBD (from a347e8bece92e00af02d5499b092700954c4fb8e)

LU-17243 build: compatibility updates for kernel 6.6

linux kernel v5.19-rc1-4-gc4f135d64382
  workqueue: Wrap flush_workqueue() using a macro
linux kernel v6.5-rc1-7-g20bdedafd2f6
  workqueue: Warn attempt to flush system-wide workqueues.
If __flush_workqueue(system_wq) is not available fall back to
flush_scheduled_work()

Lustre-change: https://review.whamcloud.com/52908
Lustre-commit: a0e6d6f7327598d13661bb14098a9f21f2035285

LU-17592 build: compatibility updates for kernel 6.8

Linux commit v6.7-rc1-3-gda549bdd15c2
  dentry: switch the lists of children to hlist
Provide trivial wrappers to abstract the changed members

Lustre-change: https://review.whamcloud.com/54229
Lustre-commit: 6d27c2c8c72e853a238fd3fc7f42d658188ca02f

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.5 testlist=sanity
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.5 testgroup=full-part-3

Change-Id: I1bce12b2b7190bcbd880916049667630aba700c8
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56754
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-17696 llite: remove LASSERT from ll_ddelete() 30/56830/2
Jian Yu [Wed, 30 Oct 2024 17:21:57 +0000 (10:21 -0700)]
LU-17696 llite: remove LASSERT from ll_ddelete()

On Linux kernel 6.8, the changes in commit 2f42f1eb9093
("Call retain_dentry() with refcount 0") made d_delete()
instances called for dentries with ->d_lock held and
refcount equal to 0, which caused the following assertion
failure on Lustre client:

(dcache.c:136:ll_ddelete()) ASSERTION( d_count(de) == 1 ) failed

The value of d_count(de) became 0 instead of 1. Since
retain_dentry() was called either with refcount 0 or 1,
we can simply remove the LASSERT(ll_d_count(de) == 1)
from ll_ddelete() to avoid the above failure.

Lustre-change: https://review.whamcloud.com/54676
Lustre-commit: 0176629ab3f71e88850ab95796b0e519c4d0f740

Change-Id: Ic4a39d9328326634190cd0719b4c0637e1bf315c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-16520 build: Move strscpy to libcfs common header 66/56766/4
Shaun Tancheff [Wed, 23 Oct 2024 06:25:55 +0000 (23:25 -0700)]
LU-16520 build: Move strscpy to libcfs common header

Ensure strscpy is available to lustre

Lustre-change: https://review.whamcloud.com/49863
Lustre-commit: 7fe7f4ca06b9c8d128f7ba36988e36f8141ed53d

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0c3673c2aa7e6b61671521a8cabde8a364f7f6f8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56766
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-9859 libcfs: migrate libcfs_mem.c to lnet/lib-mem.c 65/56765/6
James Simmons [Fri, 25 Oct 2024 19:51:04 +0000 (12:51 -0700)]
LU-9859 libcfs: migrate libcfs_mem.c to lnet/lib-mem.c

Move the libcfs_mem.c code to the LNet core. The prototypes are declared in libcfs_cpu.h
but we don't move them yet since the CPT code depends on the libcfs_mem.c work. This can
end up in a modular cyclic dependency if we move the CPT work right away so limit what is
changed at this point.

Lustre-change: https://review.whamcloud.com/52701
Lustre-commit: 24d515367f44de6b92b453cc9a1c8384e52b5e3f

LU-9859 lnet: move CPT handling to LNet

The CPT work is used for LNet and ptlrpc which is the Lustre LNet
interface. Move this work there and merge the lib-mem.c code as
well since they both work closely together. Move cpt debugfs
handling from libcfs to lnet. Now all remaining debugfs in libcfs
is for debugging.

Lustre-change: https://review.whamcloud.com/52923
Lustre-commit: 7f8cde3b77ada95e8b96dee1996f8d40bd17a538

LU-9859 libcfs: remove workitem.

There are no more users of the "workitem" code so it can be removed.
Lustre uses Linux workqueues instead.

Lustre-change: https://review.whamcloud.com/50462
Lustre-commit: 1782884fa247da0c1400ee6307596b64d6aaa440

Test-Parameters: trivial
Change-Id: I6bf5cd9f20033f988dde1989f0fc5f89ea74b5a2
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56765
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-9859 libcfs: move percpt_lock into lnet 93/56793/2
Mr NeilBrown [Fri, 25 Oct 2024 18:57:43 +0000 (11:57 -0700)]
LU-9859 libcfs: move percpt_lock into lnet

lnet is the only user of percpt_lock - and there are only two such
locks!
So move the code into lnet, as part of deprecating libcfs.

Lustre-change: https://review.whamcloud.com/50832
Lustre-commit: c4e2563ff3bfa84ab7558c2aced32445da543ef6

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id7091e88cf61228aa031921747fb9c7b08214931
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56793
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-9859 lnet: convert selftest to use workqueues 74/56774/2
Mr NeilBrown [Thu, 24 Oct 2024 00:29:39 +0000 (17:29 -0700)]
LU-9859 lnet: convert selftest to use workqueues

Instead of the cfs workitem library, use workqueues.

As lnet wants to provide a cpu mask of allowed cpus, it
needs to be a WQ_UNBOUND work queue so that tasks can
run on cpus other than where they were submitted.
We use alloc_ordered_workqueue for lst_sched_serial (now called
lst_serial_wq) - "ordered" means the same as "serial" did.
We use cfs_cpt_bind_queue() for the other workqueues which sets up the
CPU mask as required.

An important difference with workqueues is that there is no equivalent
to cfs_wi_exit() which can be called in the action function and which
will ensure the function is not called again - and that the item is no
longer queued.

To provide similar semantics we treat swi_state == SWI_STATE_DONE as
meaning that the wi is complete and any further calls must be no-op.
We also call cancel_work_sync() (via swi_cancel_workitem()) before
freeing or reusing memory that held a work-item.

To ensure the same exclusion that cfs_wi_exit() provided the state is
set and tested under a lock - either crpc_lock, scd_lock, or tsi_lock
depending on which structure the wi is embedded in.
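
The "done means no-op" convention can be sketched as below. This is a single-threaded userspace model, not the selftest code: only SWI_STATE_DONE is named in the text, so the other states and all model_* names are hypothetical, and the locking is indicated in comments only.

```c
#include <assert.h>

/* Only the DONE state comes from the text; NEW and RUNNING are
 * hypothetical placeholders for the remaining states. */
enum model_swi_state {
	MODEL_SWI_STATE_NEW,
	MODEL_SWI_STATE_RUNNING,
	MODEL_SWI_STATE_DONE,
};

struct model_workitem {
	enum model_swi_state state;
	int runs;	/* how many times the action did real work */
};

/* Action function: returns void, as a workqueue handler does.  In the
 * real code the state test would be made under the owning lock. */
static void model_action(struct model_workitem *wi)
{
	if (wi->state == MODEL_SWI_STATE_DONE)
		return;		/* completed item: further calls are no-ops */
	wi->state = MODEL_SWI_STATE_RUNNING;
	wi->runs++;
}

/* Marks the item complete, taking the place of cfs_wi_exit(). */
static void model_mark_done(struct model_workitem *wi)
{
	wi->state = MODEL_SWI_STATE_DONE;
}
```

Calling the action again after the item has been marked done leaves the run counter unchanged, which is the property that cfs_wi_exit() used to guarantee.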

Another minor difference is that with workqueues the action function
returns void, not an int.

Also change SWI_STATE_* from #define to an enum.  The only place these
values are ever stored is in one field in a struct.

Linux-commit: 6106c0f82481e686b337ee0c403821fb5c3c17ef
Linux-commit: 3fc0b7d3e0a4d37e4c60c2232df4500187a07232
Linux-commit: 7d70718de014ada7280bb011db8655e18ed935b1

Lustre-change: https://review.whamcloud.com/36991
Lustre-commit: 51dd6269c91dab7543cd9dfd1848c983efa6db36

Test-Parameters: trivial testlist=lnet-selftest
Change-Id: I5ccf1399ebbfdd4cab3696749bd1ec666147b757
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56774
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-9859 libcfs: move kernel specific code out of libcfs core 63/56763/2
James Simmons [Wed, 23 Oct 2024 00:29:11 +0000 (17:29 -0700)]
LU-9859 libcfs: move kernel specific code out of libcfs core

Over time kernel version specific code has leaked into the libcfs
core code. Move that code to the linux subdirectory code so in
the future code cleanup is not missed.

Lustre-change: https://review.whamcloud.com/52010
Lustre-commit: 8754693fe6ddac4b74e27800a05d5aea00bb0359

Test-Parameters: trivial
Change-Id: I38a00c377334066160083edd3932d4a718198497
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56763
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 months agoLU-8130 libcfs: don't use radix tree for xarray 62/56762/2
James Simmons [Wed, 23 Oct 2024 00:19:44 +0000 (17:19 -0700)]
LU-8130 libcfs: don't use radix tree for xarray

For newer kernels the radix tree is entirely based on Xarray. To
support RHEL7, Lustre backported Xarray, but the backport was still
using the radix tree. There is a mismatch between what the radix tree
expects and the use of struct xa_node when allocating and freeing
memory. Instead, abandon all use of the radix tree with Xarray. We use
our own private kmem cache, which is based on the radix tree's but
uses xa_node.

Lustre-change: https://review.whamcloud.com/51840
Lustre-commit: 778791dd7da107710c2311935a24cfd7e7a5fd85

LU-17052 libcfs: fix build for old kernel

Fix the build for kernels v4.17 to v4.19.
These old kernels already have xarray.h, which is #included by fs.h,
but do not have full xarray support. libcfs's xarray.h must also be
#included to provide the missing xarray support.

Rename the header guard macro to ensure libcfs's xarray.h will be
included.

Lustre-change: https://review.whamcloud.com/52090
Lustre-commit: 778791dd7da107710c2311935a24cfd7e7a5fd85

Test-Parameters: trivial
Test-Parameters: testlist=sanityn envdefinitions=ONLY=77,ONLY_REPEAT=20
Fixes: 84e12028be9a ("LU-9859 libcfs: add support for Xarray")
Fixes: 778791dd7da1 ("LU-8130 libcfs: don't use radix tree for xarray")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Change-Id: I87607aa0e55a4aca039f2fef5a76fbff0bedd9b3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56762
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 months agoLU-18350 tests: skip sanityn 33c/d interop 96/56696/5
Lai Siyao [Fri, 27 Sep 2024 19:29:48 +0000 (15:29 -0400)]
LU-18350 tests: skip sanityn 33c/d interop

Skip sanityn 33c 33d interop with 2.16 since they are DNE
Commit-on-Sharing related, and are refactored in 2.16.

Test-Parameters: trivial
Test-Parameters: testlist=sanityn env=ONLY=33 mdtcount=4 serverjob=lustre-master serverbuildno=4586
Fixes: 1d6b96a1cf ("LU-15529 mdt: optimize dir migration locking")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7487e2d2a142517dd425281517629fc42159b8b9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56696
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoNew release 2.15.5 2.15.5 v2_15_5
Oleg Drokin [Fri, 28 Jun 2024 18:07:09 +0000 (14:07 -0400)]
New release 2.15.5

Change-Id: Ibdf3d3d3d405f49a148da2fff4eb35ae50bce7dd
Signed-off-by: Oleg Drokin <green@whamcloud.com>
10 months agoNew RC 2.15.5-RC3 2.15.5-RC3 v2_15_5-RC3
Oleg Drokin [Wed, 26 Jun 2024 18:49:41 +0000 (14:49 -0400)]
New RC 2.15.5-RC3

Change-Id: I3eeee228c9747b1e09d0370235739891b220eb14
Signed-off-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16341 quota: fix panic in qmt_site_recalc_cb 18/55518/3
Sergey Cheremencev [Fri, 24 Jun 2022 20:38:29 +0000 (23:38 +0300)]
LU-16341 quota: fix panic in qmt_site_recalc_cb

The panic occurred because the qit_lqes array was empty after
qmt_pool_lqes_lookup_spec. This can happen if the
global lqe is not enforced. Return -ENOENT from
qmt_pool_lqes_lookup_spec if no lqes have been added.
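
The guard described above can be sketched as a lookup that refuses to return an empty result. This is a userspace model, not the lquota code: struct model_lqe and model_lqes_lookup() are hypothetical names, and only the -ENOENT convention is taken from the text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_lqe {
	int enforced;
};

/* Collects pointers to the enforced entries into out[] and stores the
 * number collected in *count.  Returns 0 on success, or -ENOENT when
 * nothing was added, so callers never index into an empty array. */
static int model_lqes_lookup(const struct model_lqe *all, size_t n,
			     const struct model_lqe **out, size_t *count)
{
	size_t i;

	*count = 0;
	for (i = 0; i < n; i++)
		if (all[i].enforced)
			out[(*count)++] = &all[i];

	return *count ? 0 : -ENOENT;
}
```

A caller that checks the return value before touching out[0] avoids the NULL dereference seen in the trace.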

This fixes the following panic:

    BUG: unable to handle NULL pointer dereference at 00000000000000f8
    ...
    RIP: 0010:qmt_site_recalc_cb+0x2ec/0x780 [lquota]
    ...
    cfs_hash_for_each_tight at ffffffffc0c72c81 [libcfs]
    qmt_pool_recalc at ffffffffc142dec7 [lquota]
    kthread at ffffffffb45043a6
    ret_from_fork at ffffffffb4e00255

Add test sanity-quota_14 that reproduces above panic without the fix,
but skip it for older MDS that do not have this fix.

Lustre-change: https://review.whamcloud.com/49241
Lustre-commit: dfe7d2dd2b0d4c0c08faa613f44d2ab1f74c7420

HPE-bug-id: LUS-11007
Change-Id: Ie51396269fae7ed84379bef5fc964cce789eba7c
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55518
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16709 lnet: fix locking multiple NIDs of the MR peer 86/55486/3
Serguei Smirnov [Tue, 4 Apr 2023 21:02:51 +0000 (14:02 -0700)]
LU-16709 lnet: fix locking multiple NIDs of the MR peer

If Lustre identifies the same peer with multiple NIDs,
as a result of peer discovery it is possible that
the discovered peer is found to contain a NID which is locked
as primary by a different existing peer record.
In this case it is safe to merge the peer records,
but the NID which got locked the earliest should be
kept as primary.

This allows for the first of the two locked NIDs
to stay primary as intended for the purpose of communicating
with Lustre even if peer discovery succeeded
using a different NID of MR peer.

Lustre-change: https://review.whamcloud.com/50530
Lustre-commit: 3b7a02ee4d656b7b3e044713681da2f56dddb152

Fixes: aacb16191a ("LU-14668 lnet: Lock primary NID logic")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iec9f8b70053fe24cddee552358500dfad0234b7f
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55486
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17476 lnet: use bits only to match ME in all cases 89/55489/2
Serguei Smirnov [Fri, 16 Feb 2024 19:01:21 +0000 (11:01 -0800)]
LU-17476 lnet: use bits only to match ME in all cases

If NIDs belong to the same peer and matchbits are matching,
declare a match even if matchbits are matched as not available
or ignored.

Lustre-change: https://review.whamcloud.com/54082
Lustre-commit: a7ae2e5515879dc31e87106314d35dc439a2c50d

Test-Parameters: testlist=sanity env=ONLY=350,ONLY_REPEAT=10
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I394c492381a2d069b34516c473220192df05fbd2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55489
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-17476 lnet: prefer to use bits only to match ME 88/55488/2
Serguei Smirnov [Sat, 27 Jan 2024 20:17:34 +0000 (12:17 -0800)]
LU-17476 lnet: prefer to use bits only to match ME

In some cases, it has been observed that a reply will arrive
at the portal with the correct match bits, but is dropped by
lnet_parse_put().  This appears to happen with LNet Multi-Rail
peers, each having two separate NIDs.

If a reply arrives with matchbits available and matching, but
the NIDs don't match, confirm the match if the NIDs are found
to belong to the same peer.  This will only happen in cases
where the reply would be dropped entirely, causing hundreds of
seconds of delay until the RPC is resent, so the extra overhead
of checking for a peer match before dropping the request is
only in the error path and minimal compared to the alternative.
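
The fallback described above can be sketched roughly as follows. This is
only an illustrative model, not the LNet code: the names me_accepts(),
same_peer() and the dictionary layout are hypothetical, and the mask
comparison is a simplified form of match-bit checking.

```python
def bits_match(me_bits: int, ignore_bits: int, msg_bits: int) -> bool:
    """Compare match bits, skipping any bit positions set in ignore_bits."""
    return ((me_bits ^ msg_bits) & ~ignore_bits) == 0


def same_peer(peers: dict, nid_a: str, nid_b: str) -> bool:
    """Return True if both NIDs are listed under one (hypothetical) peer."""
    return any(nid_a in nids and nid_b in nids for nids in peers.values())


def me_accepts(me: dict, src_nid: str, msg_bits: int, peers: dict) -> bool:
    """Decide whether a match entry accepts a message from src_nid."""
    if not bits_match(me["match_bits"], me["ignore_bits"], msg_bits):
        return False
    if me["nid"] == src_nid:
        return True
    # Error path only: the bits match but the NID differs, so check
    # whether both NIDs belong to the same Multi-Rail peer before dropping.
    return same_peer(peers, me["nid"], src_nid)
```

The peer lookup only runs once the NID comparison has already failed, which
is why the extra cost stays out of the common path.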

Add CFS_FAIL_CHECK() for exercising the match NIDs code.

That is in a hot codepath, but CFS_FAIL_CHECK() is marked unlikely()
and this check is in the error case and _should_ only be hit when
the message would have been dropped anyway, so it seems unlikely to
impact performance in any meaningful way.

Lustre-change: https://review.whamcloud.com/53843
Lustre-commit: 0b61b7d6d7940f67b75db2f4747169478512dd09

Test-Parameters: testlist=sanity env=ONLY=350,ONLY_REPEAT=10
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I10e1a2142539ddf5dabc26ce962cec1f2cfcf3db
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55488
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-15378 tests: skip sanity test_64h for old servers 62/55462/2
Andreas Dilger [Mon, 8 Apr 2024 19:14:48 +0000 (13:14 -0600)]
LU-15378 tests: skip sanity test_64h for old servers

Running sanity test_64h fails intermittently with EXA5.2 servers.
Skip it during interop since there are a number of fixes in this
area and EXA5 grant interop isn't super critical.

Lustre-change: https://review.whamcloud.com/54699
Lustre-commit: 351c3e4275025899c60d9aaed3687855479bf06b

Test-Parameters: trivial testlist=sanity env=ONLY=64 serverversion=EXA5
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I65d9e247aa62c02345c3cd0f9575e3e0ba1ff2ce
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55462
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17627 build: fix new mofed version 71/55371/3
Minh Diep [Wed, 6 Mar 2024 02:26:58 +0000 (18:26 -0800)]
LU-17627 build: fix new mofed version

Allow multi-digit MOFED version numbers.
Fix compare_version function to return what it should
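
As an illustration of why multi-digit components need numeric handling,
here is a minimal sketch. It is not the build system's compare_version
function; compare_versions() and parse_version() are hypothetical names,
and the -1/0/1 return convention is an assumption.

```python
import re


def parse_version(version: str) -> list:
    """Split a version such as '23.10-1.1.9' into integer components."""
    return [int(part) for part in re.findall(r"\d+", version)]


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 when a is older than, equal to, or newer than b."""
    pa, pb = parse_version(a), parse_version(b)
    # Pad the shorter list with zeros so that '5.8' equals '5.8.0'.
    width = max(len(pa), len(pb))
    pa += [0] * (width - len(pa))
    pb += [0] * (width - len(pb))
    return (pa > pb) - (pa < pb)
```

A plain string comparison would order "23.10" before "5.8", while the
numeric comparison orders it after.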

Lustre-change: https://review.whamcloud.com/c/fs/lustre-release/+/54336
Lustre-commit: 0f7cdfe3f84a8b90d0546d989587f6ec703bd6a2

Test-Parameters: trivial
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I0f585cb355bb34270003ae1139688080c301186a
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55371
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-15977 test: improve sanityn 80b 78/55478/2
Lai Siyao [Sat, 25 May 2024 14:42:54 +0000 (10:42 -0400)]
LU-15977 test: improve sanityn 80b

Backport sanityn test_80b change from LU-15529.

Lustre-commit: 1d6b96a1cf0468bc81949960aa649cde8f927008
Lustre-change: https://review.whamcloud.com/40891

Test-Parameters: trivial mdtcount=4 mdscount=2 testlist=sanityn env=ONLY=80b,ONLY_REPEAT=20
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie3fb71ddc7de9df32baa45851c21522cc939d6bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55478
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17675 tests: flush opencache in sanity-flr/61a 61/55461/2
Alex Zhuravlev [Mon, 15 Apr 2024 05:38:39 +0000 (08:38 +0300)]
LU-17675 tests: flush opencache in sanity-flr/61a

Flush opencache to update the MDS's atime with a close RPC.

Lustre-change: https://review.whamcloud.com/54788
Lustre-commit: 3e37a49ec072a65cfa76bbb242e125450bdc9676

Test-Parameters: trivial testlist=sanity-flr clientdistro=el9.3
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5f4d3400b3f772553ee6004ac271a4aa644699e0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55461
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-15887 test: add always_except() 37/55437/2
John L. Hammond [Wed, 25 May 2022 14:07:37 +0000 (09:07 -0500)]
LU-15887 test: add always_except()

In test-framework.sh, add a new function (always_except()) to replace
manual manipulation of $ALWAYS_EXCEPT.
Add a line to contrib/scripts/spelling.txt to suggest its use.

Do not convert sanity.sh to use always_except() in backported patch
to avoid conflict with other patches currently in flight.

Lustre-change: https://review.whamcloud.com/47452
Lustre-commit: c4ff4aef7eb939d536acffaac4465039f3cfa935

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I1b39fe9555bab59e70db00cef73d13102668500a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-10215 tests: remove disk2_4 disk2_5 images 44/55444/2
Andreas Dilger [Thu, 4 Apr 2024 03:36:24 +0000 (21:36 -0600)]
LU-10215 tests: remove disk2_4 disk2_5 images

Remove the old disk2_4-*.tar.bz2 and disk2_5-ldiskfs.tar.bz2
images from the Git repo.  The disk2_5 image was never included into
testing due to an oversight in Makefile.am, and adding it to testing
is unlikely to be of any practical value as these releases are both
more than 10 years old and very unlikely to have any users that would
actually want to upgrade their systems at this point.

Lustre-change: https://review.whamcloud.com/47281
Lustre-commit: add70ce9cdc3e2d6148dafeef587d12d0277744c

Test-Parameters: trivial
Test-Parameters: testlist=conf-sanity env=ONLY=32 mdscount=1 mdtcount=1
Test-Parameters: testlist=conf-sanity env=ONLY=32 mdscount=2 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I76a42eb90c3e1198d33783f3089ac30462429ac4
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-10465 test: fix sanity-pfl interop for 4M stripe size 40/55440/2
Andreas Dilger [Fri, 14 Jun 2024 23:24:36 +0000 (17:24 -0600)]
LU-10465 test: fix sanity-pfl interop for 4M stripe size

New servers could use 4MiB default stripe size, so some of
the tests need to use bigger component extent or specify stripe size
explicitly to accommodate enough stripe count.

Patch fixes sanity-pfl to take into account stripe size in some tests.

This change for test scripts comes from:

Lustre-change: https://review.whamcloud.com/37318
Lustre-commit: ea18d7da59d369f093e340e150544f51b2f229a1

Test-Parameters: trivial
Test-Parameters: testlist=sanity-pfl serverjob=lustre-master serverbuildno=4540 env=ONLY="1 14 19 20 24"
Fixes: ee7dfc5ad1 ("LU-17025 llapi: Verify stripe pool name")
Fixes: 0396310692 ("LU-15727 lod: honor append_pool with default composite layouts")
Fixes: b384ea39e5 ("LU-14480 pool: wrong usage with ost list")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3cef8805247fc5253e0a0ac05157b9d6093ebbe5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-10465 test: fix interop test for 4M default stripe size 51/54851/5
Andreas Dilger [Thu, 23 Jan 2020 20:15:10 +0000 (20:15 +0000)]
LU-10465 test: fix interop test for 4M default stripe size

New servers could use 4MiB default stripe size, so some of
the tests need to use bigger component extent or specify stripe size
explicitly to accommodate enough stripe count.

Patch includes several test fixes:
- sanity-flr: use bigger component size and amount of data to
              saturate all stripes as expected by test
- sanity-lfsck: 36[a-c] to use 1M stripe as expected by calcs
- sanity: 130g to use 1M stripe prior to FIEMAP calcs

This change for test scripts comes from:

Lustre-change: https://review.whamcloud.com/37318
Lustre-commit: ea18d7da59d369f093e340e150544f51b2f229a1

Test-Parameters: trivial
Test-Parameters: testlist=sanity-flr serverjob=lustre-master serverbuildno=4540 env=ONLY="0 208"
Test-Parameters: testlist=sanity-lfsck serverjob=lustre-master serverbuildno=4540 env=ONLY="36"
Test-Parameters: testlist=sanity serverjob=lustre-master serverbuildno=4540 env=ONLY="130"
Fixes: ee7dfc5ad1 ("LU-17025 llapi: Verify stripe pool name")
Fixes: 0396310692 ("LU-15727 lod: honor append_pool with default composite layouts")
Fixes: b384ea39e5 ("LU-14480 pool: wrong usage with ost list")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3cef8805247fc5253e0a0ac05157b9d609054df9
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54851
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16084 tests: fix lustre-patched filefrag check 36/55436/2
Andreas Dilger [Wed, 10 Aug 2022 18:27:56 +0000 (12:27 -0600)]
LU-16084 tests: fix lustre-patched filefrag check

Fix sanity test_130b thru test_130g to check for "filefrag -l"
instead of "filefrag -e", since the "-e" option has been in
upstream e2fsprogs since commit v1.42.6-50-g2508eaa7.  The "-l"
option (logical extent ordering) is really what is needed to
handle Lustre-striped files anyway.

While there, fix the code style in these subtests:
- use "local" and lower-case names for local variables
- use $(...) for subshells
- use (( ... )) for numeric comparisons
- use preferred "check || action" style checks
- use "skip_env" for environment configuration checks (e2fsprogs)
- use "skip" for test-related checks that can't be "fixed"
- use pre-defined $ost1_FSTYPE for checking OST filesystem type

Lustre-change: https://review.whamcloud.com/48188
Lustre-commit: fef1db004c4230e1051f9266f34a658501bf5d03

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8eb7f17a9532796ab0274247194dd52cbc8a141c
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55436
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17050 tests: remove sanity-sec IDENTITY_UPCALL 38/55438/2
Andreas Dilger [Thu, 14 Sep 2023 20:59:58 +0000 (14:59 -0600)]
LU-17050 tests: remove sanity-sec IDENTITY_UPCALL

In sanity-sec.sh there was an IDENTITY_UPCALL variable that
conflicted with an identically-named global variable in
test-parameters.  Due to new checks by Gerrit Janitor, this
was causing any patch that ran sanity-sec.sh to log a warning.

Remove the parameter from sanity-sec.sh as it is unused.
Update code style in functions updating identity_upcall.

Lustre-change: https://review.whamcloud.com/52400
Lustre-commit: 6a04acde6d0900f3e7fb3ed4929dc98d58bd194c

Test-Parameters: trivial testlist=sanity-sec mdscount=2 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2f2ace33cf01153d16f4f25038065d33443ebbe5

Change-Id: I2bda5557d62d41f170eda55e66cdb8add28c85c7
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55438
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16673 tests: add checking for user USER0 33/55433/2
Xinliang Liu [Mon, 3 Apr 2023 03:47:23 +0000 (03:47 +0000)]
LU-16673 tests: add checking for user USER0

Add checking for user USER0 in tests 125, 154a, 154b.

Lustre-change: https://review.whamcloud.com/50501
Lustre-commit: b7b55534b9b5d4d58025082ae5403938853b168c

Test-Parameters: trivial testlist=sanity env=ONLY="125 154a 154b"
Change-Id: Id42d4b6dca4c05757d02483ddedd65be55df96d6
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15259 tests: use existing usernames for setfacl 34/55434/3
Andreas Dilger [Fri, 14 Jun 2024 16:48:29 +0000 (12:48 -0400)]
LU-15259 tests: use existing usernames for setfacl

In SLES15.2 and Ubuntu 20 the "bin" and "daemon" users are not
defined in /etc/passwd, causing setfacl to print a cryptic error:

  setfacl -m u:bin:rw f -- failed
  ~     ? setfacl: Option -m: Invalid argument near character 3

Replace "bin" and "daemon" in ACL tests so they are run with user
and group names that exist on all distros currently being tested.
They can also be specified via ACLUSR1/ACLUSR2 in the test config.

The "permission_xattr" test also needs "nobody" user and group.

Also, the "getfacl" command prints users and groups in numerical
order, so the ACL tests will fail if "daemon" < "bin", or if either
group is higher than the "users" group.  Fix them as needed.

Lustre-change: https://review.whamcloud.com/45627
Lustre-commit: 60188994e24b95db5915b8e6802f7963ffb2fd9c

Test-Parameters: trivial testlist=sanity-quota,sanity-sec,pjdfstest
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=el8.9 serverdistro=el8.9
Test-Parameters: testlist=sanity env=ONLY=103-154,SANITY_EXCEPT=130,HONOR_EXCEPT=y clientdistro=el9.3
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=sles15sp4
Test-Parameters: testlist=sanity env=ONLY=103-154 clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7003e95577ab3a9314e8d4d29bb6b1784b9f8ae7
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55434
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
10 months agoNew RC 2.15.5-RC2 2.15.5-RC2 v2_15_5-RC2
Oleg Drokin [Sun, 16 Jun 2024 03:36:50 +0000 (23:36 -0400)]
New RC 2.15.5-RC2

Change-Id: I3a923311676177058db71841ad7e41bb1e376953
Signed-off-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17146 tests: avoid sanity-lfsck/test_38 failure 25/55425/3
Bruno Faccini [Tue, 26 Sep 2023 09:15:35 +0000 (11:15 +0200)]
LU-17146 tests: avoid sanity-lfsck/test_38 failure

This regression has been introduced in kernels after commit
v5.11-10234-gcbd59c48ae2b (5.12), and is fixed with
commit v6.2-rc4-61-g5956592ce337 (6.2).
The issue has been introduced by upstream
commit 8c8387ee3f55
("mm: stop filemap_read() from grabbing a superfluous page").
Skip sanity-lfsck/test_38 for this range of kernels.

Lustre-change: https://review.whamcloud.com/52537
Lustre-commit: 8ecbd1b5085fac6463889146290ef56df2710eeb

Test-Parameters: trivial testlist=sanity-lfsck
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ic6066e43959c913c2f225d229927803471f06cee
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Bruno Faccini <bfaccini@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14291 tests: make module loading of ost optional 58/54258/2
James Simmons [Wed, 14 Feb 2024 12:38:25 +0000 (07:38 -0500)]
LU-14291 tests: make module loading of ost optional

Future Lustre versions will no longer have an ost kernel module.
load_module in the test framework will fail, so capture the
failure and ignore it. We will need this for interop testing.

Lustre-change: https://review.whamcloud.com/54040
Lustre-commit: ef7deb7b076e554279f88f6d57afa17884027f9a

Change-Id: Iedff4f6a36ceffa9428e3f891db78b7538217085
Test-Parameters: trivial
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16623 tests: interop sanity-flr/202 sanity-pfl/15 28/53628/8
Andreas Dilger [Mon, 13 May 2024 18:09:55 +0000 (11:09 -0700)]
LU-16623 tests: interop sanity-flr/202 sanity-pfl/15

Fix interop testing with sanity-flr test_202 and sanity-pfl test_15
to request a specific number of stripes instead of "-c -1", which
may not always allocate objects on all OSTs if they are full/busy.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-flr env=ONLY=202 serverjob=lustre-master serverbuildno=4527
Test-Parameters: testlist=sanity-pfl env=ONLY=15 serverjob=lustre-master serverbuildno=4527

Fixes: ced540165e ("LU-16623 lod: handle object allocation consistently")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1df9324d9d978e9253f7d4a433d85934c33ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17110 llite: fix slab corruption with fm_extent_count=0 12/52512/4
Etienne AUJAMES [Tue, 12 Sep 2023 16:06:25 +0000 (18:06 +0200)]
LU-17110 llite: fix slab corruption with fm_extent_count=0

If userspace uses fiemap with .fm_extent_count=0, .fm_extents[0] is
not allocated. Writing on the first entry without checking the extent
count could lead to memory corruption (slab).

This patch also fixes the case when the OSC is disabled: FIEMAP_EXTENT_LAST
should be set on the extent (fe_flags) and not on the fiemap struct.

Add a regression test sanityn 71d to test fiemap with
fm_extent_count=0.
Add a regression test sanity-hsm 408 to test fiemap on released files.

Lustre-change: https://review.whamcloud.com/52352
Lustre-commit: a81dc7d0e158894e905ab3d309f7b92864a94378

Fixes: 4097196 ("LU-11848 lov: FIEMAP support for PFL and FLR file")
Test-Parameters: testlist=sanityn
Test-Parameters: testlist=sanityn env=ONLY=71d,ONLY_REPEAT=20
Test-Parameters: testlist=sanity-hsm env=ONLY=408,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: Id63c6973540187e678020977f2d555dfcbf3c634
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52512
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-17013 lov: fill FIEMAP_EXTENT_LAST flag 94/52494/5
Lei Feng [Thu, 3 Aug 2023 09:44:15 +0000 (17:44 +0800)]
LU-17013 lov: fill FIEMAP_EXTENT_LAST flag

If a file has N extents and the fiemap is requested with exactly N
extent slots, the last extent will be missing the FIEMAP_EXTENT_LAST
flag. Fix it.

Lustre-change: https://review.whamcloud.com/51863
Lustre-commit: b1739ba3fadcc7cf0f330f6984bd51d3d801247d

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: testlist=sanityn env=ONLY="71a 71b 71c"
Change-Id: I4556b31f0d04bdf8e83f323e83b871b093beaa5e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52494
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-16480 lov: fiemap improperly handles fm_extent_count=0 08/52308/2
Andrew Perepechko [Mon, 16 Jan 2023 13:13:34 +0000 (08:13 -0500)]
LU-16480 lov: fiemap improperly handles fm_extent_count=0

FIEMAP calls with fm_extent_count=0 are supposed only to
return the number of extents.

lov_object_fiemap() attempts to initialize stripe_last
based on fiemap->fm_extents[0] which is not initialized
in userspace and not even allocated in kernelspace.

Eventually, the call exits with -EINVAL and "FIEMAP does
not init start entry" kernel log message.
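
The expected count-only behaviour can be modelled with a small sketch.
This is a pure-Python model for illustration, not the lov_object_fiemap()
code: fiemap_model() and the (start, length) tuples are hypothetical, and
only the FIEMAP_EXTENT_LAST value is taken from the Linux FIEMAP interface.

```python
FIEMAP_EXTENT_LAST = 0x00000001  # flag marking the last extent of a file


def fiemap_model(file_extents, extent_count):
    """Return (mapped_count, extents) for a file's (start, length) list."""
    if extent_count == 0:
        # Count-only mode: report how many extents exist and never
        # touch the (unallocated) output array.
        return len(file_extents), []
    filled = []
    for index, (start, length) in enumerate(file_extents[:extent_count]):
        is_last = index == len(file_extents) - 1
        filled.append({
            "logical": start,
            "length": length,
            "flags": FIEMAP_EXTENT_LAST if is_last else 0,
        })
    return len(filled), filled
```

In this model a caller can first pass extent_count=0 to learn how many
slots to allocate, then repeat the call with that many slots.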

Lustre-change: https://review.whamcloud.com/49645
Lustre-commit: 829af7b029d8e4e391b93792bf5214611b0193bd

Fixes: 409719608c ("LU-11848 lov: FIEMAP support for PFL and FLR file")
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: I65e706b5dd5c8a6db90a539c2602af839b4da823
HPE-bug-id: LUS-11443
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52308
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17883 kernel: update SLES15 SP5 [5.14.21-150500.55.65.1] 28/55228/4
Jian Yu [Tue, 28 May 2024 23:32:47 +0000 (16:32 -0700)]
LU-17883 kernel: update SLES15 SP5 [5.14.21-150500.55.65.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.65.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/55227
Lustre-commit: 530f49f628bad9b8bb3c2a87a79009b735998938

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp5 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3

Change-Id: Ie0601c190e52d6192bf389338be51c77db03a9c2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55228
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-17404 kernel: update RHEL 9.4 [5.14.0-427.20.1.el9_4] 66/55366/3
Jian Yu [Wed, 12 Jun 2024 17:13:40 +0000 (10:13 -0700)]
LU-17404 kernel: update RHEL 9.4 [5.14.0-427.20.1.el9_4]

Update RHEL 9.4 kernel to 5.14.0-427.20.1.el9_4 for Lustre client.

Lustre-change: https://review.whamcloud.com/54712
Lustre-commit: TBD (from 000ac2084bef80ed9b0610245ab7552f678d3e39)

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3

Change-Id: Ieee88a5a9f8e58f8445e126d21e45228e7b5ca64
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55366
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
10 months agoLU-17942 kernel: update RHEL 8.10 [4.18.0-553.5.1.el8_10] 11/55411/2
Jian Yu [Wed, 12 Jun 2024 17:08:24 +0000 (10:08 -0700)]
LU-17942 kernel: update RHEL 8.10 [4.18.0-553.5.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.5.1.el8_10.

Lustre-change: https://review.whamcloud.com/55410
Lustre-commit: TBD (from 2cc06472d975dfe224d09bac0fd54316a721a122)

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-3

Change-Id: I9e36b42c56f0d5e45077350d0afc32f207e3d8b7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55411
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-17197 obdclass: preserve fairness when waiting for rpc slot 32/53232/3
Shaun Tancheff [Wed, 18 Oct 2023 03:54:59 +0000 (22:54 -0500)]
LU-17197 obdclass: preserve fairness when waiting for rpc slot

When obd_get_mod_rpc_slot() waits for an available slot it places the
waiting thread at the HEAD of the queue, so it will be woken before
anything else that is already queued.  This is clearly unfair and can
hurt performance.

So change to always add to the tail to ensure a FIFO ordering (except
that CLOSE might sometimes be woken a bit early).
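
The ordering difference can be shown with a toy queue. This is a sketch
only, assuming waiters are always woken from the head; wake_order() is a
hypothetical helper and does not model the CLOSE special case.

```python
from collections import deque


def wake_order(arrivals, add_to_tail):
    """Return the order in which queued waiters would be woken."""
    queue = deque()
    for name in arrivals:
        if add_to_tail:
            queue.append(name)      # FIFO: newest waiter goes last
        else:
            queue.appendleft(name)  # old behaviour: newest waiter goes first
    # Wake-ups are taken from the head of the queue.
    return [queue.popleft() for _ in range(len(queue))]
```

With head insertion the most recent waiter is woken first and the oldest
one waits longest; with tail insertion the arrival order is preserved.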

This regression was introduced in a rewrite that was supposed to make
waiting more fair - by avoiding a broadcast wakeup for "close"
requests.

Also fix some stale comments and expose __add_wait_queue_entry_tail

Running mdtest with the patch applied shows about a 3% improvement:

                             master            patched
  mdtest-easy-write      350.585906 kIOPS   353.783545 kIOPS
   mdtest-easy-stat     1320.329353 kIOPS  1408.320419 kIOPS
 mdtest-easy-delete      285.084103 kIOPS   289.625900 kIOPS
            [SCORE]      509.115803 kiops   524.516113 kiops

Lustre-change: https://review.whamcloud.com/52738
Lustre-commit: b5fde4d6c02324a8511afe30d02eb2cf46ea799d

Fixes: 5243630b09d2 ("LU-15947 obdclass: improve precision of wakeups for mod_rpcs")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If767c4299bcbab71589b0f3c01e85bf461686ca5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53232
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-16633 obdclass: fix rpc slot leakage 39/51539/3
Alex Zhuravlev [Fri, 10 Mar 2023 17:47:05 +0000 (20:47 +0300)]
LU-16633 obdclass: fix rpc slot leakage

obd_get_mod_rpc_slot() can race with obd_put_mod_rpc_slot():
finishing wait_woken() resets WQ_FLAG_WOKEN (which is set
when the corresponding thread gets a slot, incrementing
cl_mod_rpcs_in_flight). Then another thread executing
__wake_up_locked_key() may find that wq_entry again and call
claim_mod_rpc_function() one more time, again incrementing
cl_mod_rpcs_in_flight. Thus it is incremented twice for a
single obd_get_mod_rpc_slot().

    #1: obd_get_mod_rpc_slot()     #2: obd_put_mod_rpc_slot()
    flags &= ~WQ_FLAG_WOKEN
    list_add()
    wait_woken()
      schedule
                                   claim_mod_rpc_function()
                                   cl_mod_rpcs_in_flight++
                                   wake_up()

    flags &= ~WQ_FLAG_WOKEN

                                   #3: obd_put_mod_rpc_slot()
                                   claim_mod_rpc_function()
                                   cl_mod_rpcs_in_flight++
                                   wake_up()
    list_del()

The patch introduces a replacement for WQ_FLAG_WOKEN which is never
reset once set.
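
The effect of a flag that is never reset can be illustrated with a
deterministic replay of the interleaving. This is a simplified model and
not the obdclass code: the Waiter class, claim() and run() are
hypothetical, and it assumes the waker skips a waiter whose flag is
already set.

```python
class Waiter:
    def __init__(self):
        self.woken = False    # models WQ_FLAG_WOKEN, which the waiter clears
        self.claimed = False  # models a sticky flag that is never reset


def claim(waiter, state, sticky):
    """Waker side: give a slot to the waiter unless it already has one."""
    already_has_slot = waiter.claimed if sticky else waiter.woken
    if already_has_slot:
        return False
    state["in_flight"] += 1
    waiter.woken = True
    waiter.claimed = True
    return True


def run(sticky):
    """Replay the interleaving and return the final in-flight count."""
    state = {"in_flight": 0}
    waiter = Waiter()
    claim(waiter, state, sticky)  # thread #2 releases a slot, wakes waiter
    waiter.woken = False          # thread #1 leaves wait_woken(), flag cleared
    claim(waiter, state, sticky)  # thread #3 finds the waiter still queued
    return state["in_flight"]
```

In this model run(sticky=False) counts the single waiter twice, while
run(sticky=True) counts it once.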

Lustre-change: https://review.whamcloud.com/50261
Lustre-commit: 91a3726f313df33e099320d171039f8371fec27f

Fixes: 5243630b09 ("LU-15947 obdclass: improve precision of wakeups for mod_rpcs")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I29371c8c85414413c5a8e41dec3632f64ad127bb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51539
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-15595 tests: Router test interop check and aarch fix 46/51546/3
Chris Horn [Wed, 14 Sep 2022 01:23:37 +0000 (20:23 -0500)]
LU-15595 tests: Router test interop check and aarch fix

setup_router_test() executes load_lnet() on remote nodes, but
this function was only added in 2.15. Add a version check for it.

Enabling routing may fail on nodes with a small amount of memory (like
the aarch config). Define a small number of router buffers to work around
this issue. Modify the functions which calculate the number of buffers
to allow small sizes to be specified via parameters.

Lustre-change: https://review.whamcloud.com/48578
Lustre-commit: 1aba6b0d9b661d3699cbd4624e9db334a13fc647

Test-Parameters: trivial testlist=sanity-lnet serverversion=2.12.9
Test-Parameters: testgroup=review-ldiskfs-arm testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If0b76747fe09e883546f18da9f3322c72263e29d
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51546
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
11 months agoNew tag 2.15.5-RC1 2.15.5-RC1 v2_15_5-RC1
Oleg Drokin [Thu, 30 May 2024 22:10:17 +0000 (18:10 -0400)]
New tag 2.15.5-RC1

Change-Id: I2d863fb40f635e5193660d03b3a5f57ae694f336
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-17402 kernel: RHEL 8.10 client and server support 93/55193/4
Jian Yu [Tue, 28 May 2024 23:45:13 +0000 (16:45 -0700)]
LU-17402 kernel: RHEL 8.10 client and server support

This patch makes changes to support RHEL 8.10 release
with kernel 4.18.0-553.el8_10 for Lustre client and server.

Lustre-change: https://review.whamcloud.com/54800
Lustre-commit: TBD (from 6748f47fac79e557ae21eb790b597be6449c926a)

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-3

Change-Id: I0a9a262d13e0b0de3607da0982468fd8b5f6a7aa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55193
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months agoLU-17404 kernel: update RHEL 9.4 [5.14.0-427.18.1.el9_4] 04/55204/3
Jian Yu [Tue, 28 May 2024 23:42:16 +0000 (16:42 -0700)]
LU-17404 kernel: update RHEL 9.4 [5.14.0-427.18.1.el9_4]

Update RHEL 9.4 kernel to 5.14.0-427.18.1.el9_4 for Lustre client.

Lustre-change: https://review.whamcloud.com/55203
Lustre-commit: TBD (from 07a23833999207c336532bcf75aa9d5a954f1b07)

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3

Change-Id: If18027650ff953733f2e57727b71d2daa61d249c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55204
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-17749 kernel: update RHEL 8.9 [4.18.0-513.24.1.el8_9] 57/55157/2
Jian Yu [Mon, 20 May 2024 19:52:09 +0000 (12:52 -0700)]
LU-17749 kernel: update RHEL 8.9 [4.18.0-513.24.1.el8_9]

Update RHEL 8.9 kernel to 4.18.0-513.24.1.el8_9.

Lustre-change: https://review.whamcloud.com/54821
Lustre-commit: TBD (from 23a99efd9104b328ce1edb5fc9094bce2c06e9b9)

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.8 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.8 serverdistro=el8.9 testlist=sanity

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.9 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.9 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.9 serverdistro=el8.9 \
  testgroup=full-part-3

Change-Id: I94b5a95e9e85f2f5e0cddb1dbb519ef92520ad0b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55157
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-17641 kernel: update RHEL 9.3 [5.14.0-362.24.1.el9_3] 37/55137/2
Jian Yu [Fri, 17 May 2024 07:18:21 +0000 (00:18 -0700)]
LU-17641 kernel: update RHEL 9.3 [5.14.0-362.24.1.el9_3]

Update RHEL 9.3 kernel to 5.14.0-362.24.1.el9_3 for Lustre client.

Lustre-change: https://review.whamcloud.com/54820
Lustre-commit: TBD (from 3448e237b53797044e5b25544667a31ac761a9e9)

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 testlist=sanity

Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3

Change-Id: Ifafb3fbbfdfcd82506daed44d3601a0d4357331e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55137
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-17404 kernel: new kernel [RHEL 9.4 5.14.0-427.16.1.el9_4] 17/54717/5
Jian Yu [Sat, 11 May 2024 07:27:49 +0000 (00:27 -0700)]
LU-17404 kernel: new kernel [RHEL 9.4 5.14.0-427.16.1.el9_4]

Add support for the new RHEL 9.4 release
for the Lustre client.

Lustre-change: https://review.whamcloud.com/54712
Lustre-commit: TBD (from 177846a0aa58b35d43696b3c3c5d71df0109ab14)

Test-Parameters: trivial \
  mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3

Change-Id: Ic292c01ad16dc06e8dee966c4a211896fea284c0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54717
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-17850 build: prefer LINUXRELEASE over uname -r 15/55115/3
Jian Yu [Wed, 15 May 2024 04:52:41 +0000 (21:52 -0700)]
LU-17850 build: prefer LINUXRELEASE over uname -r

In a container or chroot environment, "uname -r" reports
the host instead of the target kernel version. We should
use the LINUXRELEASE variable which is configured in
config/lustre-build-linux.m4 with the value from UTS_RELEASE.
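
The preference order described above can be sketched roughly as follows.
This is an illustration only: in the real build LINUXRELEASE is set by
configure rather than looked up in a mapping, and the function name
target_kernel_release and its settings argument are hypothetical. The
fallback to the running kernel is also an assumption.

```python
import platform


def target_kernel_release(settings):
    """Return the kernel release to build for.

    `settings` is a hypothetical mapping standing in for the configured
    build variables; it is not how the Lustre build passes LINUXRELEASE.
    """
    configured = settings.get("LINUXRELEASE", "").strip()
    if configured:
        # Prefer the configured target kernel (taken from UTS_RELEASE).
        return configured
    # Assumed fallback: the running kernel, i.e. what "uname -r" prints.
    # In a container or chroot this is the host kernel, not the target.
    return platform.release()


print(target_kernel_release({"LINUXRELEASE": "4.18.0-553.el8_10"}))
print(target_kernel_release({}))
```

The first call returns the configured release unchanged; the second
returns whatever the running kernel reports.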

Lustre-change: https://review.whamcloud.com/55108
Lustre-commit: 0c46ba62efb35b31bb826e5898ffa6e52768e7fa

Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Iaa48027f5ae873e1298695a264db1c351d9eac5c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55115
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
11 months ago LU-16819 build: use mofed path based on target kernel 17/55117/4
Ake Sandgren [Wed, 15 May 2024 05:09:13 +0000 (22:09 -0700)]
LU-16819 build: use mofed path based on target kernel

Instead of using "uname -r", which limits builds to the currently
running kernel, use the target kernel which is available in
LINUXRELEASE, if the directory is available.
Building for a specific kernel is common practice when using DKMS.

Lustre-change: https://review.whamcloud.com/50937
Lustre-commit: 0e9708016b9948676484d290326c1fe8a269eb80
Test-Parameters: trivial
Signed-off-by: Ake Sandgren <ake.sandgren@hpc2n.umu.se>
Change-Id: Ifce912061a74fc5b7435cd940105190f0c3cd544
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55117
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
11 months ago LU-17034 quota: tmp fix against memory corruption 35/55035/2
Sergey Cheremencev [Mon, 8 Apr 2024 11:43:53 +0000 (14:43 +0300)]
LU-17034 quota: tmp fix against memory corruption

Change QMT_INIT_SLV_CNT from 64 to 2000 to avoid accessing
memory beyond the end of the lqeg_arr array. This could happen
when at least one OST has an index larger than the total number
of OSTs. This is a temporary fix, and the maximum supported OST
index is 0x7d0 (2000). It will be replaced later by a long-term
solution.
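
The failure mode can be illustrated with a small sketch. It assumes,
for illustration only, that the array has QMT_INIT_SLV_CNT slots and is
addressed directly by OST index; the function name and the sample
indices are hypothetical and not taken from the patch.

```python
QMT_INIT_SLV_CNT_OLD = 64    # value before this change
QMT_INIT_SLV_CNT_NEW = 2000  # value after this change (0x7d0)


def out_of_range_indices(ost_indices, capacity):
    """Return the OST indices that do not fit in `capacity` slots."""
    return [idx for idx in ost_indices if not 0 <= idx < capacity]


# Three OSTs, one of which has a sparse index far above the OST count.
ost_indices = [0, 1, 300]
print(out_of_range_indices(ost_indices, QMT_INIT_SLV_CNT_OLD))
print(out_of_range_indices(ost_indices, QMT_INIT_SLV_CNT_NEW))
```

With the old size, index 300 falls outside the array even though only
three OSTs exist; with the new size all three indices fit.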

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8d9444017fa9847142f3df77c63368282ff134c4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-15163 osd: osd_obj_map_recover() to restart transaction 25/55125/2
Alex Zhuravlev [Tue, 26 Oct 2021 08:38:50 +0000 (11:38 +0300)]
LU-15163 osd: osd_obj_map_recover() to restart transaction

osd_obj_map_recover() stops the transaction when it needs to call
vfs_link(), and it then has to start a new transaction to modify
the filesystem.

Lustre-commit: 7bf0e557a2b3a463e4d78e81b6ab93987d3dc8af
Lustre-change: https://review.whamcloud.com/45368

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I6efe5444ddc959b19092bebc6e3c7dc25a29cea1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55125
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-16831 lfs: limit stripe count for component size 75/53775/6
Bobi Jam [Mon, 13 May 2024 18:06:37 +0000 (11:06 -0700)]
LU-16831 lfs: limit stripe count for component size

If stripe count is larger than component_size/stripe_size, some
allocated OST objects are created but inaccessible. This patch
reduces the number of stripes in that case to avoid this.
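
A minimal sketch of the limit, assuming integer division and a floor of
one stripe (both assumptions, as is the function name); it covers only
components with a fixed size and is not the lfs implementation:

```python
def limit_stripe_count(stripe_count, component_size, stripe_size):
    """Clamp stripe_count so every stripe can hold data in the component."""
    if component_size <= 0 or stripe_size <= 0:
        raise ValueError("component_size and stripe_size must be positive")
    # Only component_size / stripe_size stripes can ever be written;
    # objects beyond that would be allocated but inaccessible.
    usable = max(1, component_size // stripe_size)
    return min(stripe_count, usable)


MIB = 1 << 20
print(limit_stripe_count(8, 4 * MIB, 1 * MIB))
print(limit_stripe_count(2, 4 * MIB, 1 * MIB))
```

A 4 MiB component with a 1 MiB stripe size holds at most four stripes,
so a request for eight is reduced to four, while a request for two is
left alone.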

Lustre-change: https://review.whamcloud.com/51143
Lustre-commit: a250ecb959a98c2ec0a01bbca9d943a19b8fa078

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I117ed8a7696c6c6adcdd0c2c6531a958cc53bd51
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53775
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-16297 ptlrpc: don't panic during reconnection 40/55040/2
Alexander Boyko [Thu, 3 Nov 2022 11:23:20 +0000 (07:23 -0400)]
LU-16297 ptlrpc: don't panic during reconnection

ptlrpc_send_rpc() could race with ptlrpc_connect_import_locked()
in the middle of the assertion check, which leads to a spurious panic.
The assertion first checks

(AT_OFF || imp->imp_state != LUSTRE_IMP_FULL ||

then the reconnect changes the import state and flags,
and the second part is evaluated:

(imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) ||
!(imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_AT)))

MSGHDR_AT_SUPPORT is disabled during client reconnection.
Taking a lock in this hot path is undesirable, so the fix changes
the assertion to a report.
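
The idea of reporting instead of asserting can be sketched as below.
The four booleans stand in for the terms of the condition quoted above;
the function names and the logger are hypothetical, and this is not the
ptlrpc code.

```python
import logging

log = logging.getLogger("ptlrpc_sketch")


def at_invariant_holds(at_off, imp_full, msghdr_at_support, connect_at):
    """Evaluate the condition from the original assertion."""
    return at_off or not imp_full or msghdr_at_support or not connect_at


def report_at_state(at_off, imp_full, msghdr_at_support, connect_at):
    """Log a warning instead of panicking when the condition is false."""
    ok = at_invariant_holds(at_off, imp_full, msghdr_at_support, connect_at)
    if not ok:
        # A concurrent reconnect can make the condition false briefly,
        # so it is reported rather than treated as fatal.
        log.warning(
            "AT state mismatch: full=%s msghdr_at=%s connect_at=%s",
            imp_full, msghdr_at_support, connect_at,
        )
    return ok


print(report_at_state(False, True, False, True))
print(report_at_state(False, True, True, True))
```

The first call models the racy state (import full, AT negotiated, but
the header flag already cleared) and only logs; the second models the
consistent state.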

Lustre-change: https://review.whamcloud.com/49029
Lustre-commit: df31c4c0b39b8845911344e6fadc008bcba40bb1

HPE-bug-id: LUS-10985
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ifc9e413c679c3e8a4c8f4f541251bebabae41c82
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55040
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-16623 tests: ignore sanity-pfl stripe-count off-by-1 83/55083/2
Andreas Dilger [Sun, 14 Apr 2024 05:54:24 +0000 (23:54 -0600)]
LU-16623 tests: ignore sanity-pfl stripe-count off-by-1

In some cases the MDS may not create all stripes on a file, if the
MDT-OST connection does not have precreated objects.  This is OK,
so the tests should not fail the stripe-count check if trying to
create a fully-striped file and one of the stripes is missing.
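
The relaxed check can be sketched as follows. The function name and its
arguments are hypothetical; the actual test scripts are shell and may
express this differently.

```python
def stripe_count_acceptable(actual, expected, fully_striped):
    """Accept an exact match, or one missing stripe on a fully-striped file."""
    if actual == expected:
        return True
    # Tolerate a single missing stripe only when the file was meant to
    # use every OST, since one OST may have had no precreated objects.
    return fully_striped and actual == expected - 1


print(stripe_count_acceptable(7, 8, fully_striped=True))
print(stripe_count_acceptable(6, 8, fully_striped=True))
print(stripe_count_acceptable(3, 4, fully_striped=False))
```

Only the first case passes: one stripe short on a fully-striped file is
tolerated, while two stripes short, or any shortfall on a file with an
explicit smaller stripe count, still fails.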

Lustre-change: https://review.whamcloud.com/54778
Lustre-commit: e715a8c2a616f6d4158decfed5dec2fa444f0c67

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie482fdf86f82e7a2292c021761885249a6c551f1
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55083
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-16515 tests: disable sanity test_118c/118d 77/55077/2
Andreas Dilger [Sat, 11 May 2024 07:20:43 +0000 (00:20 -0700)]
LU-16515 tests: disable sanity test_118c/118d

Temporarily disable sanity test_118c and test_118d until there is
a fix available, since this is failing a large fraction of tests.

Lustre-change: https://review.whamcloud.com/50470
Lustre-commit: 7c52cbf65218d77c0594f92981173aa7d78f6758

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I16ebbc470a126bb99b5c3ecdf93407d6b73ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55077
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 months ago LU-17362 build: Update ZFS version to 2.1.15 91/54791/3
Jian Yu [Fri, 10 May 2024 00:12:44 +0000 (17:12 -0700)]
LU-17362 build: Update ZFS version to 2.1.15

Update ZFS version to 2.1.15. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.1.15

Lustre-change: https://review.whamcloud.com/54769
Lustre-commit: 01103eba35e88638d8860457fbdf89b101d4ab67

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.9 testlist=sanity

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  testgroup=full-dne-zfs-part-3

Change-Id: I51532dbf9dbcadf64bb9dbd3b10e88d0cab38ffd
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54791
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>