Whamcloud - gitweb
fs/lustre-release.git
3 months ago LU-18057 mdt: don't include DOM bit to stripes lock 28/55828/3
Mikhail Pershin [Mon, 22 Jul 2024 08:50:06 +0000 (11:50 +0300)]
LU-18057 mdt: don't include DOM bit to stripes lock

Exclude the DOM bit from the inodebits mask used for restriping.
Including it might trigger assertions further on in the inodebits
code if it conflicts with a GROUP lock. The bit is not needed anyway;
it is taken only as part of MDS_INODELOCK_FULL, which is used there.

The patch uses the mask MDS_RESTRIPE_ELC, which excludes the DOM bit,
for restriping, and prohibits further attempts to combine the DOM
lock with other mandatory ibits. Note that this restriction applies
only to local MDT locks, as they are blocking locks.
In all such cases trybits are to be used either for the DoM bit or
for the others.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2fed05caf60aaa17a0d91ecf7b72df2b4ff95141
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18054 build: Debian 12 with module-assistant 0.11.11 17/55817/2
Shaun Tancheff [Sat, 20 Jul 2024 04:46:23 +0000 (11:46 +0700)]
LU-18054 build: Debian 12 with module-assistant 0.11.11

Building with module-assistant 0.11.11 fails in module-assistant
prep-deb-files generic.make rule:

   The required compiler '<...gcc-12 make...>' is not installed, \
   won't continue!
   Set RELAX_CC_CHECK variable to skip plausibility checks.

Suppress the auto detection by explicitly setting
CC and RELAX_CC_CHECK when running m-a

HPE-bug-id: LUS-12443
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Icb2330884b542ba2a1a36b0318a3551c14c8ea09
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55817
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-16350 ldiskfs: Server support for linux v6.10 29/55729/3
Shaun Tancheff [Tue, 23 Jul 2024 02:09:42 +0000 (09:09 +0700)]
LU-16350 ldiskfs: Server support for linux v6.10

Updated patch series for Linux v6.10:
   ext4-corrupted-inode-block-bitmaps-handling-patches.patch
   ext4-delayed-iput.patch
   ext4-filename-encode.patch
   ext4-max-dir-size.patch
   ext4-mballoc-extra-checks.patch
   ext4-misc.patch
   ext4-prealloc.patch

The same updates apply to the Ubuntu 6.10.0 kernel.

Test-Parameters: trivial
HPE-bug-id: LUS-11376
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I456ec723f04aaf57cb64965cc9d53fbea23a8c27
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55729
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18034 build: compatibility updates for kernel 6.10
 28/55728/3
Shaun Tancheff [Mon, 15 Jul 2024 01:57:04 +0000 (08:57 +0700)]
LU-18034 build: compatibility updates for kernel 6.10

Braces are required around trivial conditional statements that
devolve to a single semi-colon (;).

The static modifier should be the first modifier used.

Use the 'static inline' or 'inline' modifier at the front
of the declaration if inline is to be used.

For 'const int func()', the const is discarded and should not be used.
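
The rules above can be illustrated with a small C sketch. This is only
an illustration under assumptions: TRACE_EVENT(), clamp_to_zero() and
answer() are hypothetical names that do not come from the patch.

```c
/* Hypothetical debug hook that is compiled out to nothing. */
#define TRACE_EVENT(msg)

/* 'static' comes first, then 'inline', at the front of the declaration. */
static inline int clamp_to_zero(int value)
{
	/*
	 * TRACE_EVENT() expands to nothing, so without braces this would
	 * devolve to "if (value < 0) ;". The braces make the intent clear.
	 */
	if (value < 0) {
		TRACE_EVENT("negative value");
	}

	return value < 0 ? 0 : value;
}

/*
 * The return type is plain 'int'. Writing 'const int answer(void)'
 * would not change anything, because the qualifier is discarded.
 */
static int answer(void)
{
	return 42;
}
```

Calling clamp_to_zero(-5) returns 0 and clamp_to_zero(7) returns 7,
with or without the hook enabled.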

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iff67ff589e3ffde522c3b5bc03b1ec6705c6fc5c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55728
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-17057 tests: check OSCs FULL state when setting GSS flvr 28/55628/4
Sebastien Buisson [Tue, 2 Jul 2024 13:12:25 +0000 (15:12 +0200)]
LU-17057 tests: check OSCs FULL state when setting GSS flvr

When setting a GSS flavor, make sure all OSCs are in FULL state, so
that clients refresh their connections with the updated flavor.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7c43856e7951f23f2299b25e133fea72400daf94
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55628
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-15737 ofd: don't block destroys 98/55598/5
Alexander Boyko [Sat, 15 Jun 2024 10:04:34 +0000 (06:04 -0400)]
LU-15737 ofd: don't block destroys

ofd_destroy_by_fid() could sleep indefinitely on a GROUP lock
conflict. If all of the MDT's osp_sync_inflight slots are spent on
such destroys, the MDT is not able to send destroys and setattrs,
and as a result OST free space leaks.

This fix makes ldlm_cli_enqueue_local() nonblocking for group locks,
and makes the MDT repeat the sync requests that failed with errors.
The patch also adds a debugfs file to check for hung osp jobs:
lctl get_param osp.lustre-OST0000-osc-MDT0000.error_list

Add recovery-small test 160. It reproduces the situation where the
MDT sends object destroys and they hang on the OST side because of
a conflicting GROUP lock.

Lustre: ll_ost02_068: service thread pid 51278 was inactive for
204.776 seconds. The thread might be hung...
Call Trace TBD:
ldlm_completion_ast+0x7ac/0x900 [ptlrpc]
ldlm_cli_enqueue_local+0x307/0x880 [ptlrpc]
ofd_destroy_by_fid+0x235/0x4a0 [ofd]
ofd_destroy_hdl+0x263/0xa10 [ofd]
tgt_request_handle+0xcc9/0x1a20 [ptlrpc]
ptlrpc_server_handle_request+0x23f/0xc60 [ptlrpc]
ptlrpc_main+0xc8b/0x15d0 [ptlrpc]

HPE-bug-id: LUS-12350
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I396bf48d3d29f058f65095cbb4dbba11581534cc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55598
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-17000 utils: integer overflow fixes 88/55588/6
Shaun Tancheff [Tue, 2 Jul 2024 07:07:46 +0000 (14:07 +0700)]
LU-17000 utils: integer overflow fixes

CoverityID: 426402 ("Logical vs. bitwise operator")
LIBCFS_ALLOC_PRE()
  Add extra parentheses to clarify && vs & precedence

CoverityID: 429655 ("Overflowed integer argument")
LIBCFS_FREE()
  Use size_t to avoid integer overflow

CoverityID: 429629 ("Overflowed integer argument")
jobid_interpret_string()
  Prevent joblen from becoming negative, truncate if necessary.

CoverityID: 429557 ("Overflowed constant")
ll_stats_pid_write()
  If len is 0, prevent stack corruption via kernbuf

CoverityID: 429646 ("Overflowed integer argument")
llog_pack_buffer()
  ssize_t read(): prevent int overflow if read() returns > INT_MAX

CoverityID: 429630 ("Overflowed integer argument")
readline() in cacheio.c
  ssize_t read(): prevent int overflow if read() returns > INT_MAX

CoverityID: 429624 ("Overflowed integer argument")
osd_read()
  passes loff_t size to osd_ldiskfs_readlink, update
  osd_ldiskfs_readlink to accept size_t length to avoid a
  theoretical overflow
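
Three of the patterns listed above can be sketched in user-space C.
This is a hedged illustration only: the flag values and the helper
names needs_zeroing(), narrow_read_result() and clamp_joblen() are
hypothetical and are not the functions changed by the patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical flag values, used only for illustration. */
#define FLAG_ATOMIC 0x1u
#define FLAG_ZERO   0x2u

/* Parenthesize the bitwise test so it cannot be misread next to '&&'. */
static bool needs_zeroing(bool enabled, unsigned int flags)
{
	return enabled && ((flags & FLAG_ZERO) != 0);
}

/*
 * Narrow a read()-style ssize_t result to int without overflow:
 * errors map to -1 and values above INT_MAX are capped.
 */
static int narrow_read_result(ssize_t rc)
{
	if (rc < 0)
		return -1;
	if (rc > (ssize_t)INT_MAX)
		return INT_MAX;
	return (int)rc;
}

/* Keep a length from going negative and truncate it to fit the buffer. */
static size_t clamp_joblen(long joblen, size_t bufsize)
{
	if (joblen < 0)
		return 0;
	if ((size_t)joblen >= bufsize)
		return bufsize > 0 ? bufsize - 1 : 0;
	return (size_t)joblen;
}
```

Keeping the value in ssize_t or size_t until it has been range-checked
is the common theme; the narrowing cast happens only after the check.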

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ica8a5e1ce58e540016e4bc101763f835eed2c2f7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55588
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-17989 tests: Cleanup fake interface usage 65/55465/6
Chris Horn [Mon, 17 Jun 2024 17:58:41 +0000 (11:58 -0600)]
LU-17989 tests: Cleanup fake interface usage

Callers of have_interface() are using it to check that setup_fakeif()
completed successfully. Update setup_fakeif() to perform its own
validation. Both IPv4 and IPv6 addresses should be assigned to the
fake interface, so setup_fakeif() now checks for both of these.

Test cases that use the non-default namespace should call
setup_netns()/cleanup_netns() themselves rather than relying on the
test suite to set up this environment.

Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I80451c55b8b7b6919b7d7e94e72265f2e6aa2854
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55465
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-15504 utils: lfs find -ls function 43/55443/11
Maximilian Dilger [Sun, 16 Jun 2024 01:55:46 +0000 (21:55 -0400)]
LU-15504 utils: lfs find -ls function

Add a -ls function for lfs find. It is equivalent to using
printf "%i\t%k\t%M\t%n\t%u\t%g\t%s\t%t\t%p\n"

Signed-off-by: Maximilian Dilger <mdilger@whamcloud.com>
Change-Id: If84687915a2f71be81ee8adc5c9402371d635956
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
3 months ago LU-14714 mgc: server to mount without local config 83/55283/8
Mikhail Pershin [Wed, 29 May 2024 18:27:13 +0000 (21:27 +0300)]
LU-14714 mgc: server to mount without local config

A server normally mounts using its local config copy, but
that is not always possible. The local config can be
damaged or empty, or may not have been updated from the MGS,
e.g. due to -ENOSPC on the server during llog backup or
other errors. That leaves an empty or partial local
config, so the server is unable to mount with it.

The patch allows a server to mount from the remote config first
if the local config wasn't copied from the MGS. If remote
processing is not possible or fails, then the local config
is used as a last attempt to mount.
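
The fallback order described above can be modelled in a few lines of
C. This is a sketch under assumptions, not the mgc code: the enum, the
callback type, the stubs and pick_config() are all hypothetical, and
the behaviour when a valid local copy fails to process is a guess.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of the selection, for this sketch only. */
enum cfg_source { CFG_REMOTE, CFG_LOCAL, CFG_NONE };

/* Hypothetical callback: returns 0 on success, non-zero on failure. */
typedef int (*cfg_process_fn)(void);

/* Stubs standing in for real config processing. */
static int stub_ok(void) { return 0; }
static int stub_fail(void) { return -1; }

static enum cfg_source pick_config(bool local_copied_from_mgs,
				   cfg_process_fn process_remote,
				   cfg_process_fn process_local)
{
	/* A local copy that was synced from the MGS is used as before. */
	if (local_copied_from_mgs)
		return process_local() == 0 ? CFG_LOCAL : CFG_NONE;

	/* Otherwise try the remote config first, if it is reachable. */
	if (process_remote != NULL && process_remote() == 0)
		return CFG_REMOTE;

	/* Last attempt: fall back to whatever local config exists. */
	return process_local() == 0 ? CFG_LOCAL : CFG_NONE;
}
```

For example, pick_config(false, stub_fail, stub_ok) yields CFG_LOCAL,
which corresponds to the "last attempt" case in the message.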

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I442a4f20eeb7deb1b40ccc7cabb1fae65804e211
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55283
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-17779 lnet: use copy_page() 23/54923/4
Patrick Farrell [Fri, 26 Apr 2024 02:27:26 +0000 (22:27 -0400)]
LU-17779 lnet: use copy_page()

When copying one page to another in kernel memory, the
kernel has an optimized copy_page which can be used instead
of memcpy().

This is relevant in the lnet loopback subsystem, which
copies from the kiov from the client to that on the server.
(Using the same page is nasty for a lot of reasons, so
 copying is best.)

So we can check for the full page to full page copy and
use that.
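
The full-page check can be sketched in user space as follows. This is
an illustration under assumptions: SKETCH_PAGE_SIZE, sketch_copy_page()
and copy_fragment() are hypothetical, and memcpy() stands in for the
kernel's copy_page(), which is not available outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* True when both sides start at offset 0 and cover one whole page. */
static bool is_full_page_copy(size_t dst_off, size_t src_off, size_t len)
{
	return dst_off == 0 && src_off == 0 && len == SKETCH_PAGE_SIZE;
}

/* User-space stand-in for the kernel's optimized whole-page copy. */
static void sketch_copy_page(void *dst, const void *src)
{
	memcpy(dst, src, SKETCH_PAGE_SIZE);
}

/* Copy one fragment; returns true if the whole-page path was taken. */
static bool copy_fragment(unsigned char *dst_page, size_t dst_off,
			  const unsigned char *src_page, size_t src_off,
			  size_t len)
{
	if (is_full_page_copy(dst_off, src_off, len)) {
		sketch_copy_page(dst_page, src_page);
		return true;
	}

	/* Partial copies keep using the generic byte copy. */
	memcpy(dst_page + dst_off, src_page + src_off, len);
	return false;
}
```

Only the exact page-to-page case takes the fast path; any offset or
short length falls back to the generic copy.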

On my little tiny VM system, this improves maximum write
performance (with the fake write fail_loc enabled) by about
20%, from 4.4 GiB/s to 5.7 GiB/s.

We should also eventually be able to add a fake copy to the
fake read/fake write fail loc, but that's a bit tricky, so
will be left out of this patch.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Iea89447ed03bd4646544883b588873700f6e09a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54923
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
3 months ago LU-10499 pcc: add pcc_mode parameter for permission check 22/54422/4
Qian Yingjin [Wed, 1 Sep 2021 13:45:47 +0000 (21:45 +0800)]
LU-10499 pcc: add pcc_mode parameter for permission check

This patch introduces a "llite.*.pcc_mode" parameter for PCC.
With this parameter, an administrator can determine which file access
permissions allow files to be brought into a PCC device for
caching.
This parameter is set to 0 by default.
Add sanity-pcc test_46 to verify it.

This patch also ignores the EEXIST error when the file is found to
be already attached to PCC during a manual attach.

EX-bug-id: EX-3741
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1e006e4f723c1c177ae84c64ad32c6049a57110f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54422
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-15837 utils: added lfs find --printf functions 41/55441/20
Maximilian Dilger [Wed, 12 Jun 2024 17:04:31 +0000 (13:04 -0400)]
LU-15837 utils: added lfs find --printf functions

Add functionality for:
     %i (inode number)
     %M (symbolic file access mode)
     %g (groupname)
     %u (username)

Test-Parameters: trivial
Signed-off-by: Maximilian Dilger <mdilger@whamcloud.com>
Change-Id: I2a577c1c9869bfcda9aa20f60db65c7dc888204d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55441
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-13814 osc: do not call osc_lru_use for transient 87/52087/22
Patrick Farrell [Fri, 23 Feb 2024 16:22:20 +0000 (11:22 -0500)]
LU-13814 osc: do not call osc_lru_use for transient

Transient pages are never added to the LRU, because they
can't be cached.  osc_lru_use already skips them because
they don't have the flag set, but make it explicit that
this is not called for transient pages.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I2c92ccb52380faefbcba3bfa35508dac2b601bd4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52087
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months ago LU-13814 osc: remove "osc_page_transfer_add" wrapper 86/52086/21
Patrick Farrell [Fri, 23 Feb 2024 16:21:49 +0000 (11:21 -0500)]
LU-13814 osc: remove "osc_page_transfer_add" wrapper

osc_page_transfer_add is just a wrapper around osc_lru_use;
let's remove it to be more explicit about what we're
actually doing.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I495a90bee7dc8f8c9d823fa47f9303d2fac2a829
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months ago LU-13814 clio: remove cl_page_prep for transients 85/52085/21
Patrick Farrell [Fri, 23 Feb 2024 16:21:26 +0000 (11:21 -0500)]
LU-13814 clio: remove cl_page_prep for transients

cl_page_prep no longer does anything for transient pages;
finish cleaning that up and make it explicit with asserts.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I3285a69f971530ba0407de128430bb2497900d11
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52085
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months ago LU-17151 tests: increase memcg limit on x86_64 68/55568/7
Qian Yingjin [Fri, 28 Jun 2024 02:41:47 +0000 (22:41 -0400)]
LU-17151 tests: increase memcg limit on x86_64

The x86_64 memcg limit of 384MB is quite small, often resulting in
failures of sanity/test_411b in some extreme situations when testing
in VM environments.

To avoid the failures, increase the limit to 1024MiB on both
x86_64 and arm systems.

Also skip the test if the available OST storage space is
insufficient.

Test-Parameters: trivial testlist=sanity env=ONLY=411b,ONLY_REPEAT=100
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1064a4c2523b8d3e721f9712c17b829a9b1796dc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55568
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18019 tests: Convert hostname to IP list 07/55707/9
Chris Horn [Thu, 11 Jul 2024 18:51:18 +0000 (12:51 -0600)]
LU-18019 tests: Convert hostname to IP list

Modify h2name_or_ip() to convert the specified hostname into a
comma-separated list of NIDs. If FORCE_LARGE_NID is true, and the host
has IPv6 addresses, then only those IPv6 NIDs are listed. If it is
false, and the host has IPv4 addresses, then only those IPv4 NIDs are
listed. Otherwise, we list NIDs based on whichever addresses are
present.
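
The selection rule above can be written as a small decision function.
The real helper is a shell function in the test framework; this C
model is only a sketch, and LIST_IPV4, LIST_IPV6 and select_families()
are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical bit flags for the address families to list. */
#define LIST_IPV4 0x1u
#define LIST_IPV6 0x2u

static unsigned int select_families(bool force_large_nid,
				    bool has_ipv4, bool has_ipv6)
{
	/* Large NIDs forced and available: list only IPv6 NIDs. */
	if (force_large_nid && has_ipv6)
		return LIST_IPV6;

	/* Large NIDs not forced and IPv4 available: list only IPv4 NIDs. */
	if (!force_large_nid && has_ipv4)
		return LIST_IPV4;

	/* Otherwise list whichever addresses are present. */
	return (has_ipv4 ? LIST_IPV4 : 0u) | (has_ipv6 ? LIST_IPV6 : 0u);
}
```

With this model, a host that has only IPv4 addresses still yields
LIST_IPV4 when FORCE_LARGE_NID is true, matching the "otherwise" case.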

h2name_or_ip() may be called prior to init_test_env, and in this case
FORCE_LARGE_NID will not be initialized. Add FORCE_LARGE_NID to
cfg/local.sh to ensure it is set before h2name_or_ip() is called.

Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0ef74eef5748d495e6b64f023c07d8eb4a23a5ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-17925 utils: fix 'lfs setstripe -c -1' interop 48/55548/11
Rajeev Mishra [Thu, 27 Jun 2024 03:37:18 +0000 (03:37 +0000)]
LU-17925 utils: fix 'lfs setstripe -c -1' interop

This issue was caused by the -C option storing -1 as 0xffdf
and -33 as 0xffff, which changed the on-disk file layout
compared to all older versions, which store -1 as 0xffff. This
resulted in an interop failure of 'lfs setstripe -c/-C' usage.

Instead, store stripe_count as -1 = 0xffff, and -32 = 0xffe0, like
normal unsigned short.  There is no need to distinguish "-c -1"
from "-C -1" in the layout since they both mean the same thing, to
allocate one object per OST.  However, don't allow "-c -2..-32"
since this has no meaning today.  Keep these in reserve in case we
assign some meaning to them in the future.
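
The encoding described above is an ordinary conversion to a 16-bit
unsigned value. A minimal sketch, assuming a hypothetical helper
encode_stripe_count() and a hypothetical SKETCH_MAX_OVERSTRIPE limit
derived from the "-1 .. -32" range in the text (the upper bound on
positive counts is also an assumption of this sketch):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit taken from the "-1 .. -32" range in the text. */
#define SKETCH_MAX_OVERSTRIPE 32

/* Returns 0 and fills *out on success, -1 if the count is rejected. */
static int encode_stripe_count(int count, bool overstripe, uint16_t *out)
{
	if (count < 0) {
		/* "-c" accepts only -1; "-C" accepts -1 .. -32. */
		int lowest = overstripe ? -SKETCH_MAX_OVERSTRIPE : -1;

		if (count < lowest)
			return -1;
	} else if (count >= UINT16_MAX - SKETCH_MAX_OVERSTRIPE + 1) {
		/* Would collide with the values used for negative counts. */
		return -1;
	}

	/* Plain unsigned short conversion: -1 -> 0xffff, -32 -> 0xffe0. */
	*out = (uint16_t)count;
	return 0;
}
```

Both "-c -1" and "-C -1" produce 0xffff here, so the stored layout
does not distinguish them, as the message explains.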

Restore LLAPI_LAYOUT_WIDE and LOV_ALL_STRIPES definitions, since
these are part of the API and may be used by userspace applications.

Rename LOV_ALL_STRIPES_MIN to LOV_ALL_STRIPES_WIDE since it was
confusing that the "MIN" value created files with the most stripes.

Rename LOV_V1_INSANE_STRIPE_COUNT to LOV_V1_INSANE_STRIPE_INDEX
since it is really the maximum OST index and not a stripe count.

Test-Parameters: testlist=sanity-flr serverversion=2.15 env=SANITY_FLR_EXCEPT="0k 44e 205"
Fixes: 1a6ef725c285 ("LU-16938 utils: setstripe overstripe multiple OST count")
Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
Change-Id: I3908bd5a70aa35305ef8f278fb0346319055e5b3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55548
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
3 months ago LU-17905 kernel: new kernel [SLES15 SP6 6.4.0-150600.23.14.2] 63/55563/14
Jian Yu [Thu, 25 Jul 2024 17:27:34 +0000 (10:27 -0700)]
LU-17905 kernel: new kernel [SLES15 SP6 6.4.0-150600.23.14.2]

This patch makes changes to support new SLES15 SP6 release
with kernel 6.4.0-150600.23.14.2 for Lustre client.

In Lustre test suites, there are some subtests using filefrag
from Lustre-patched e2fsprogs. This patch adds checks in those
subtests to skip them if the Lustre-patched e2fsprogs is not
installed on Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  env=SANITY_EXCEPT="27J 103a 244a" \
  clientdistro=sles15sp6 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp6 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp6 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp6 testgroup=full-part-3

Change-Id: Ib9159d200122595d0a56e3581cfc66d75ddb59f6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55563
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-15593 tests: fix sanity-flr/44e interop check 45/55845/3
Frederick Dilger [Tue, 23 Jul 2024 18:55:37 +0000 (12:55 -0600)]
LU-15593 tests: fix sanity-flr/44e interop check

sanity-flr.sh test_44e was failing interop testing with 2.15 servers
because the version check was 2.14.52 when the patch didn't land
until v2_15_50-155-ga3f1c4622a. Update the interop check accordingly.

Test-Parameters: trivial testlist=sanity-flr
Test-Parameters: testlist=sanity-flr env=ONLY=44 serverversion=2.15
Fixes: a3f1c4622a ("LU-15593 mdt: Add option to disable use of SOM")
Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I432078410003bea260032473729d89c9e174330f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55845
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-18056 osp: fix KASAN warning when lu2dt_dev() lwp device 25/55825/2
Timothy Day [Sun, 21 Jul 2024 21:47:29 +0000 (17:47 -0400)]
LU-18056 osp: fix KASAN warning when lu2dt_dev() lwp device

An lwp device isn't really a dt device, so it shouldn't be
marked as such. This causes a KASAN warning when obd_setup()
attempts to access dt device fields using lu2dt_dev():

 BUG: KASAN: slab-out-of-bounds in obd_setup+0x208/0x4b0 [obdclass]

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib52b0f93c35a7d966314b6375ee963bc59f86abb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55825
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-17685 tests: skip nocompr test for older versions 16/55816/3
Alexandre Ioffe [Fri, 19 Jul 2024 23:43:01 +0000 (16:43 -0700)]
LU-17685 tests: skip nocompr test for older versions

nocompr flag in lfs mirror extend is supported
after MDS version v2_15_61-245-g37e1316050

Fixes: 37e1316050 ("LU-17685 utils: Allow nocompr flag in lfs mirror extend")
Test-Parameters: trivial testlist=sanity-flr env=ONLY="205a 205b"
Test-Parameters: trivial testlist=sanity-flr env=ONLY="205a 205b"
Test-Parameters: trivial testlist=sanity-flr env=ONLY="205a 205b"
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I5c23e549893b9f6dbc016b79d4a6601a8320bf94
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55816
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18027 lfs: lfs find handling special files 09/55709/15
Caleb Carlson [Tue, 2 Apr 2024 16:29:48 +0000 (10:29 -0600)]
LU-18027 lfs: lfs find handling special files

This replaces fget_projid() with get_projid(),
a newer function that is able to get project ids on
regular files, directories, symbolic links, and
special file types. This adopts the same approach to
getting file attributes that lfs_project.c uses in
project_get_fsxattr(): a special file's attributes are
obtained via the containing directory.

This patch also replaces the exclusion logic for
the --projid flag, reducing the depth count of if/else
clauses by checking a boolean equivalence statement.

Boolean algebra table for why this works (skip means
we don't print, include means we do):

matches | exclude_projid | matches == exclude_projid |
------------------------------------------------------
   1    |       1        |            1 (skip)       |
   1    |       0        |            0 (include)    |
   0    |       1        |            0 (include)    |
   0    |       0        |            1 (skip)       |
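
The table reduces to a single equality test. A minimal C sketch, with
hypothetical function names, showing the one-line form next to the
nested if/else form it replaces:

```c
#include <stdbool.h>

/* Skip the entry when 'matches' and 'exclude_projid' have the same value. */
static bool should_skip(bool matches, bool exclude_projid)
{
	return matches == exclude_projid;
}

/* Equivalent nested form, kept here only for comparison. */
static bool should_skip_nested(bool matches, bool exclude_projid)
{
	if (exclude_projid) {
		if (matches)
			return true;	/* matched an excluded projid */
		return false;
	}
	if (matches)
		return false;		/* matched a wanted projid */
	return true;
}
```

Both functions agree on all four rows of the table above.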

Finally, this patch replaces the INVALID_PROJID = -1
definition with DEFAULT_PROJID = 0. A major reason for
this is that projid is defined as an unsigned, 32-bit
integer, so storing a negative value in it doesn't make
much sense. With this patch we should be able to get
the project id for every file type (special or not),
so we no longer need a case for invalid projid.

Add .vscode/settings.json files to .gitignore.

Add test_56ei in sanity.sh that sets a project id
on some special files and checks that lfs find --printf
picks up the file, and prints the correct project
id.

Updates error messages in test_56rd to not assume
we can't get projid from special file types.

Fix bug where we were only printing file types.

HPE-bug-id: LUS-12195
Signed-off-by: Caleb Carlson <caleb.carlson@hpe.com>
Test-Parameters: testlist=sanity
Change-Id: I8b10044ea62d662d6f9388725c0e93d55a43b431
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55709
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: corey tesdahl <corey.tesdahl@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Sakib Samar <sakib.samar@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18019 tests: Check for large NIDs in load_modules_local 94/55694/6
Chris Horn [Wed, 10 Jul 2024 16:39:52 +0000 (10:39 -0600)]
LU-18019 tests: Check for large NIDs in load_modules_local

Enforce FORCE_LARGE_NID in load_modules_local by calling load_lnet()
with config_on_load=1. This will force large NID configuration (if
large NIDs are assigned to LNet interfaces), but should not alter
the existing behavior when large NIDs are not assigned to LNet
interfaces.

Test-Parameters: trivial
Fixes: c8ca47daac ("LU-16822 tests: Force IPv6 testing in mixed environment")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I66d201c61c59246486f2aa60ab2338bd9e5317b6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55694
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 months ago LU-13403 utils: mirror-count not required for mirror extend 77/55677/7
Frederick Dilger [Tue, 9 Jul 2024 19:11:20 +0000 (13:11 -0600)]
LU-13403 utils: mirror-count not required for mirror extend

If [--mirror-count|-N] is not specified for 'lfs mirror extend', '-N'
will be added to the option arguments before lfs_setstripe_internal()
is called.

The lustre manual states for 'lfs mirror extend':
    "The mirror_count argument is optional and default to 1 if it is
     not specified."
This can be interpreted to mean that [--mirror-count|-N] does not need
to be specified, rather than the MIRROR_COUNT being optional.

It also makes sense that someone who is using 'lfs mirror extend' in
fact intends to extend the mirror count, so it would make sense to
have a default for mirror-count without it having to be specified.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I01bea6cec71fbf61c617cf27a52d7fb24fd4b06d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55677
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18005 gss: allow regular users to authenticate on MGS 40/55640/5
Sebastien Buisson [Fri, 5 Jul 2024 14:51:44 +0000 (16:51 +0200)]
LU-18005 gss: allow regular users to authenticate on MGS

It can be useful for regular users to be able to authenticate against
the MGS, for instance to run 'lfs check mgts'.
Just allow this type of authentication request in the code, and take
into account the MGC export when doing 'lfs flushctx', so that any
user key associated with the MGS gets flushed as well.

Add sanity-krb5 test_152 to exercise this capability.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4871b4ed61af918644e11d64ef5750a858713323
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55640
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-18002 build: proper MOFED detection with multiple installations 25/55625/4
Aurelien Degremont [Wed, 3 Jul 2024 13:43:36 +0000 (15:43 +0200)]
LU-18002 build: proper MOFED detection with multiple installations

For the build step, OFED detection is based on the header path
detected from the locally installed packages.

Building with multiple OFED headers installed (for different
kernels, for example) has been broken since v2_15_63-60-g0e9708016b.

This patch fixes at least the Ubuntu case. The EL case is
unchanged: what was working still works, but the broken
case is still broken.

Also simplify the package query a little bit.

Fixes: 0e9708016 ("LU-16819 build: use mofed path based on target kernel")
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: Ib15f971ea745d9deded6288e3ed4663bdd385da0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55625
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: ake sandgren <ake.sandgren@hpc2n.umu.se>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-9544 utils: set -P if missing in 'set_param -d' 99/55399/18
Frederick Dilger [Tue, 11 Jun 2024 21:35:35 +0000 (17:35 -0400)]
LU-9544 utils: set -P if missing in 'set_param -d'

The -P option to lctl set_param will now be added if
the -d option (for delete) is specified by itself.

As described in the ticket, if a value is erroneously supplied when
using -P and -d then instead of being deleted, the parameter is
set to the old value with a trailing '='. A non-regression test
has been created to verify that this isn't happening.

wait_update_cond and wait_update were modified to avoid unresolved
errors from throwing an unwanted errnum during the wait. They still
operate with logical equivalence to before.

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: If35b2e9db51f7296da25b798205b9f9104830bca
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months agoLU-10499 pcc: add test for concurrent read from 2 clients 19/54419/10
Qian Yingjin [Tue, 31 Aug 2021 03:45:25 +0000 (11:45 +0800)]
LU-10499 pcc: add test for concurrent read from 2 clients

This patch adds a test case with concurrent read access from 2
clients.
The purpose is to verify, according to the PCC attach stats, that
the client will not re-attach a file into the PCC backend once it
is attached when the file is read concurrently from 2 mount points
on a client.

EX-bug-id: EX-3730
Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibb038bd3a74f43031b6fab4e65565620c416909e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54419
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17628 lfs: add lfs_setstripe admin restrict 41/54341/12
Patrick Farrell [Mon, 1 Apr 2024 16:12:00 +0000 (12:12 -0400)]
LU-17628 lfs: add lfs_setstripe admin restrict

In some settings, it's not desirable for users to be able
to set their own striping.  This is purely a 'convenience'
restriction, where the admin prefers users not set their
own striping to avoid user error, and not a security
restriction.  This is for sites which have a sensible
default striping and prefer users not modify it.

The goal here is to avoid user error.  However, some
applications use the Lustre API to set their own striping,
and it's not desirable for such applications to fail.

So setstripe fails with an error for the 'lfs' binary, and
is silently ignored in other cases.  In all cases, the file
is created with the default layout.

Note we return EACCES for this case rather than EPERM
because EPERM is already returned (via a special case) for
setting layout on the root of the file system.  This is a
distinct case because we do not want the special group
created by this patch to be able to set the root filesystem
layout, so the root of the FS continues to be handled
separately.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Id2e8dd175f5e3870f3aa64b69556308706d5317c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54341
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: wangdi <di.d.wang@oracle.com>
3 months agoLU-18017 lnet: Swapped put/get stats in net show 89/55689/2
Cyril Bordage [Wed, 10 Jul 2024 13:16:00 +0000 (15:16 +0200)]
LU-18017 lnet: Swapped put/get stats in net show

Swap LNET_NET_LOCAL_NI_MSG_STATS_ATTR_PUT_COUNT and
LNET_NET_LOCAL_NI_MSG_STATS_ATTR_GET_COUNT in lnet_net_show_dump.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Fixes: d15bfca0785 ("LU-10391 lnet: migrate full LNet NI information collection")
Test-Parameters: trivial
Change-Id: Ib89bbae62a67c51c24c53d2634d4a00cdff9efeb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55689
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-17999 lnet: prevent race in access to peer rtrcredits count 20/55620/9
Serguei Smirnov [Thu, 4 Jul 2024 00:02:32 +0000 (17:02 -0700)]
LU-17999 lnet: prevent race in access to peer rtrcredits count

Refactor lnet_parse_forward_locked and lnet_post_routed_recv_locked
to have the code which checks and acts on peer rtrcredits in a single
spot, in order to avoid the race when the count is decremented
(by another thread) after being checked initially for the purpose of
"eager receiving" the message, which might cause an assert on
msg_rx_ready_delay to get triggered.

This race is possible if messages from the same peer NID are being
processed on different local NIs mapped to different CPTs.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ibe938882a69d860554cd9c875403bfb0399df8ec
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55620
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-15899 mdt: mdt_hsm_release default lov buffer 01/55801/2
Shaun Tancheff [Fri, 19 Jul 2024 04:07:29 +0000 (11:07 +0700)]
LU-15899 mdt: mdt_hsm_release default lov buffer

If LOV is not set during mdt_hsm_release the md_attr needs
to be assigned a large enough buffer to define lov default
values.

Since the md_attr is possibly not populated at this point
use the common mti_xattr_buf to hold the values.

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3e1c2f751bb031fdf3ea0d8583213cd5c81a57d7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55801
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-18066 obdclass: reorder s1 string instead of options string 39/55839/2
James Simmons [Tue, 23 Jul 2024 14:09:09 +0000 (10:09 -0400)]
LU-18066 obdclass: reorder s1 string instead of options string

Testing client mounts with multiple MGS NIDs exposed a bug in
which mount options could be lost (user_xattr, in our testing).
The mistake was moving the s2 string into the wrong string,
options, instead of s1.

Fixes: 415fa27540 ("LU-9325 obdclass: use match_table for server mount options")
Change-Id: I0a5e1511d558e4600a009b7e7820f68d399bcc21
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55839
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Maximilian Dilger <mdilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13814 clio: cleanup cl_page_completion 84/52084/21
Patrick Farrell [Fri, 23 Feb 2024 16:19:59 +0000 (11:19 -0500)]
LU-13814 clio: cleanup cl_page_completion

Clean up cl_page_completion and make very explicit which
parts of the function do not apply to transient pages.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Ic288717c8487ff963f0fa7f63a943e72d05d129a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52084
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months agoLU-13814 llite: note references in direct_rw_pages 83/52083/21
Patrick Farrell [Fri, 23 Feb 2024 16:18:18 +0000 (11:18 -0500)]
LU-13814 llite: note references in direct_rw_pages

Add a comment denoting the function of cl_2queue_fini in
ll_direct_rw_pages.  This will eventually be removed as
part of this process, but this comment serves as a reminder
for when we get to that step.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I98a03c7ee0d97665d77a321bc21b4fab6448b2d7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52083
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months agoLU-13814 clio: further transient own/disown removal 82/52082/20
Patrick Farrell [Fri, 12 Jul 2024 15:53:56 +0000 (11:53 -0400)]
LU-13814 clio: further transient own/disown removal

This patch goes a bit further in removing own/disown for
transient pages, including adding asserts that the code is
not called for transient pages.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Ibad555e50088adc53a66d0e4aeba5558ac2b6a06
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52082
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13814 clio: remove cl_page_delete for transient 80/52080/20
Patrick Farrell [Wed, 29 May 2024 15:16:52 +0000 (11:16 -0400)]
LU-13814 clio: remove cl_page_delete for transient

cl_page_delete no longer does anything for transient pages,
so do not call it for them.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I666cceaa1f05bf7e86fb60782eb573d5714051ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52080
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months agoLU-13814 clio: remove discard for transient pages 81/52081/22
Patrick Farrell [Wed, 29 May 2024 15:16:39 +0000 (11:16 -0400)]
LU-13814 clio: remove discard for transient pages

With cl_page_delete removed for transient pages, now
cl_page_discard doesn't do anything for transient pages.

Remove it.

Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Ib62e8a5aea71669428b7b61ba989867702bf1758
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52081
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-18045 mdt: do extra MDT cleanup before barrier 91/55791/3
Mikhail Pershin [Thu, 11 Jul 2024 16:48:40 +0000 (19:48 +0300)]
LU-18045 mdt: do extra MDT cleanup before barrier

In osp_disconnect(), do namespace cleanup so that no OSP
locks are left pinning obd_export_barrier()

In mdt_fini() call target_recovery_fini() and
mdt_quota_fini() before calling obd_export_barrier()

Fixes: ffedcbae21 ("LU-17809 osp: make disconnect asynchronous")
Change-Id: I97bf7915cf8b77e26b2a8f1ba41c6128575bd06b
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55791
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-16011 lnet: remove LBUG() in srpc_client_rpc_expired() 85/55785/2
Timothy Day [Wed, 17 Jul 2024 18:24:10 +0000 (18:24 +0000)]
LU-16011 lnet: remove LBUG() in srpc_client_rpc_expired()

We shouldn't crash just because an RPC expired.

Fixes: e5026380 ("LU-16011 lnet: use preallocate bulk for server")
Test-Parameters: trivial
Test-Parameters: testgroup=review-ldiskfs-arm testlist=sanity-lnet,lnet-selftest
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia4e9bf7f688b2920c91444f57f7a7da2b1f89a67
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55785
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-18006 osc: rework sdio locking 49/55649/7
Patrick Farrell [Sat, 6 Jul 2024 18:10:26 +0000 (14:10 -0400)]
LU-18006 osc: rework sdio locking

We cannot hold a spinlock across a call to kthread_create,
so we have to rearrange the locking for this, moving the
locking inside the data copy function.  This handles the
multithreading in a slightly different spot.

Test-Parameters: testlist=sanity env=ONLY=119f,ONLY_REPEAT=100
Test-Parameters: testlist=sanity env=ONLY=119f,ONLY_REPEAT=100
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: I30c1c6600ebfe2e1fb54606608ea55469ae06937
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17367 utils: improved hostname and IPv6 NID support 06/55706/9
Maximilian Dilger [Thu, 11 Jul 2024 19:07:17 +0000 (15:07 -0400)]
LU-17367 utils: improved hostname and IPv6 NID support

Improved NID parsing to accept hostnames, IPv4 and IPv6 addresses.
IPv6 addresses need @network to be parsed properly. The following
is a valid IPv6 address: "6699:7654::1234:1234:d84@tcp"

All non-IPv6 addresses can be passed in without a network
identifier.
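
The rule above can be sketched roughly as follows. This is only an
illustration in Python, not the parser used by the Lustre utilities:
the helper name split_nid and its return shape are assumptions made
for the example, and hostnames and IPv4 addresses are passed through
without validation.

```python
import ipaddress


def split_nid(nid):
    """Split a NID string into (address, network).

    network is None when no "@network" suffix is present.
    Hypothetical helper for illustration only.
    """
    addr, sep, net = nid.rpartition("@")
    if not sep:
        # No "@" at all: the whole string is the address.
        addr, net = nid, None
    elif not addr or not net:
        raise ValueError(f"malformed NID: {nid!r}")

    if ":" in addr:
        # A colon marks an IPv6 literal: validate it and insist on a network.
        ipaddress.IPv6Address(addr)  # raises ValueError if invalid
        if net is None:
            raise ValueError(f"IPv6 NID needs @network: {nid!r}")
    return addr, net
```

With this sketch, "6699:7654::1234:1234:d84@tcp" splits into its
address and "tcp", a bare hostname is accepted with no network, and
the same IPv6 address without "@tcp" is rejected.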

Signed-off-by: Maximilian Dilger <mdilger@whamcloud.com>
Change-Id: I131c8b2d1f7b0fe593564af90308d2d3ea278a0c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55706
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
3 months agoLU-14992 mdt: restore mkdir VBR support 14/55714/3
Lai Siyao [Sun, 2 Jun 2024 16:02:45 +0000 (12:02 -0400)]
LU-14992 mdt: restore mkdir VBR support

The patch for LU-14470 (striped mkdir replay by client request) broke
mkdir VBR support: in mkdir replay, if the target exists, it should
do a version check instead of returning -EEXIST directly.

Fixes: a2e997f0be ("LU-14470 dne: striped mkdir replay by client request")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I858499f3ef5315bbce9538733400cf6102675e4c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55714
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-18020 tests: define CONFIG for do_rpc_nodes 95/55695/2
Chris Horn [Wed, 10 Jul 2024 17:03:49 +0000 (11:03 -0600)]
LU-18020 tests: define CONFIG for do_rpc_nodes

This allows correct test environment definitions when CONFIG is
defined but not NAME. This is useful when configuration files live
some place other than lustre/tests/cfg

Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If1be2123d5fc60a1bdf210627bc5d0b9bdbf3daa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55695
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Elena <elena.gryaznova@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17853 ptlrpc: Negative value for req_waittime 05/55605/6
Frederick Dilger [Mon, 1 Jul 2024 19:54:44 +0000 (13:54 -0600)]
LU-17853 ptlrpc: Negative value for req_waittime

A negative value was being reported in req_waittime mdt. This was
likely caused by a backwards adjustment of a few microseconds and
would cause a negative time delta to be calculated.

Fixed negative wait times by setting the time delta to be 1 if
the time delta was calculated to be negative. This should have
minimal impact on the statistics.
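
A minimal sketch of the clamp described above, written in Python for
brevity. The function name and the microsecond units are assumptions
for the example; the actual change is in the ptlrpc C code.

```python
def wait_time_delta(start_usec, end_usec):
    """Return the elapsed wait time, never negative.

    If the end timestamp is earlier than the start timestamp, the
    delta is set to 1 so the statistics never record a negative value.
    """
    delta = end_usec - start_usec
    if delta < 0:
        delta = 1
    return delta
```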

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I48543b8b1fbc83829421a30f4f7be7da8b681132
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55605
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-17922 utils: added idmap range functionality 02/55502/11
Maximilian Dilger [Thu, 20 Jun 2024 20:09:15 +0000 (16:09 -0400)]
LU-17922 utils: added idmap range functionality

Added the ability to declare a range when adding idmaps to a
nodemap. The syntax is:
 <clientid_start>-<clientid_end>:<fsid_start>[-<fsid_end>]

The <fsid_end> value is optional. In practice this looks like:

nodemap_add_idmap --name test --idtype uid --idmap 500-510:10000

It is now also possible to delete idmap ranges with the
nodemap_del_idmap command, using the same syntax.
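
The range syntax can be illustrated with a small Python parser. This
is a sketch only: the function name is hypothetical, and deriving a
missing <fsid_end> from the length of the client range is an
assumption about the intended meaning, not something the commit
message states.

```python
def parse_idmap_range(spec):
    """Parse "<c_start>-<c_end>:<f_start>[-<f_end>]" into four ints."""
    client_part, sep, fs_part = spec.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in idmap range: {spec!r}")

    c_start_s, dash, c_end_s = client_part.partition("-")
    if not dash:
        raise ValueError(f"client side is not a range: {spec!r}")
    c_start, c_end = int(c_start_s), int(c_end_s)
    if c_end < c_start:
        raise ValueError(f"client range is reversed: {spec!r}")

    f_start_s, dash, f_end_s = fs_part.partition("-")
    f_start = int(f_start_s)
    if dash:
        f_end = int(f_end_s)
    else:
        # ASSUMPTION: an omitted end spans the same number of IDs
        # as the client range.
        f_end = f_start + (c_end - c_start)
    return c_start, c_end, f_start, f_end
```

Under that assumption, "500-510:10000" and "500-510:10000-10010"
parse to the same four values.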

Signed-off-by: Maximilian Dilger <mdilger@whamcloud.com>
Change-Id: If6a2b9ab11f7d435a6854055001e6102aac43115
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 months agoLU-17512 utils: new ? operator for jobid_name 32/55332/13
Maximilian Dilger [Thu, 6 Jun 2024 05:27:05 +0000 (01:27 -0400)]
LU-17512 utils: new ? operator for jobid_name

Added new ? operator when setting the jobid_name. The intended use
is: "jobid_name=%j?%H" This will use the jobid if it is available
and otherwise uses the short hostname.
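
The fallback behaviour can be sketched in Python as below. The
function name and the dictionary of code values are hypothetical, and
the sketch only handles the simple form shown above (single format
codes separated by "?"); how the real implementation treats longer
format strings is not covered here.

```python
def resolve_jobid_name(fmt, values):
    """Return the value of the first format code that is available.

    fmt    -- string such as "%j?%H"
    values -- maps a code letter (e.g. "j", "H") to a string or None
    """
    for alternative in fmt.split("?"):
        if len(alternative) != 2 or alternative[0] != "%":
            raise ValueError(f"unsupported format in this sketch: {fmt!r}")
        value = values.get(alternative[1])
        if value:
            # First alternative with a non-empty value wins.
            return value
    return ""
```

For example, with a job ID present the job ID is returned; with no
job ID the short hostname is returned instead.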

Signed-off-by: Maximilian Dilger <mdilger@whamcloud.com>
Change-Id: I418860fce5a81aa8a0a0a43c2d8bdb6d107779f9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55332
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Thomas Bertschinger <bertschinger@lanl.gov>
3 months agoLU-16741 llite: rename ptlrpc_req_finished for component llite 85/54985/4
Arshad Hussain [Thu, 2 May 2024 10:18:09 +0000 (06:18 -0400)]
LU-16741 llite: rename ptlrpc_req_finished for component llite

Patch renames ptlrpc_req_finished to ptlrpc_req_put for
llite component

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I216aa2797fbebeecae82b1d45301df7a860bde65
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17812 ldlm: stack trace log for LDLM error 96/54896/18
Rajeev Mishra [Tue, 26 Mar 2024 02:15:31 +0000 (02:15 +0000)]
LU-17812 ldlm: stack trace log for LDLM error

Added support to dump the stack trace in
ldlm_lock_debug(). The stack trace is logged only
for the case of D_ERROR and when dump_stack_on_error
is enabled.

Test-Parameters: testlist=sanity env=ONLY=105g
HPE-bug-id: LUS-12165
Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
Change-Id: I4ce280334e0273df1751257e8db03ea680831696
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54896
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-17760 lnet: Crash caused by uninitialized interface name 59/54859/13
Frank Sehr [Fri, 19 Apr 2024 22:33:12 +0000 (18:33 -0400)]
LU-17760 lnet: Crash caused by uninitialized interface name

When adding an interface with ip2net, a duplicate configuration of an
already existing interface can cause a crash or misconfiguration of
lnet. Incoming interface names have to be checked for null, and
duplicate interface configurations have to be removed.
When a duplicate is detected, it has to be added to a list so that it
can be shut down; otherwise shutdown would assert.
The problem can be reproduced on tcp and o2ib networks.
The following steps reproduce the problem in the original
configuration; it is also reproducible in other variations and
on tcp networks.
modprobe lnet
lnetctl lnet configure
lnetctl net add --net  o2ib --if mlxib1
lnetctl net add --net  o2ib --if mlxib1
       --ip2net "o2ib 172.30.12.*"

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: Ie76d97cc52855ab897a9e07a3697483189d4b19e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54859
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 lnet: Fix style issues for conrpc.c 34/55734/2
Arshad Hussain [Mon, 15 Jul 2024 08:29:42 +0000 (04:29 -0400)]
LU-6142 lnet: Fix style issues for conrpc.c

This patch fixes issues reported by checkpatch
for file lnet/selftest/conrpc.c

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icd8a9ffffd34c3330fc7c710359bcaf7f197ea52
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55734
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 lnet: Fix style issues for brw_test.c 33/55733/2
Arshad Hussain [Mon, 15 Jul 2024 07:51:36 +0000 (03:51 -0400)]
LU-6142 lnet: Fix style issues for brw_test.c

This patch fixes issues reported by checkpatch
for file lnet/selftest/brw_test.c

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I6ccda68a9becf44801e3623acac30ce4c5804374
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55733
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 lnet: Fix style issues for rpc.h 32/55732/2
Arshad Hussain [Mon, 15 Jul 2024 08:55:36 +0000 (04:55 -0400)]
LU-6142 lnet: Fix style issues for rpc.h

This patch fixes issues reported by checkpatch
for file lnet/selftest/rpc.h

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I99524bd815936c95d048a7617acfde3327d8d5e1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55732
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 lnet: Fix style issues for ping_test.c 31/55731/3
Arshad Hussain [Mon, 15 Jul 2024 09:07:11 +0000 (05:07 -0400)]
LU-6142 lnet: Fix style issues for ping_test.c

This patch fixes issues reported by checkpatch
for file lnet/selftest/ping_test.c

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I828ce5ccf6bfc9868fc7a8f9fc9bcb8a9293d118
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55731
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17974 quota: fix qmt_pool_lqes_lookup_spec 35/55535/2
Sergey Cheremencev [Tue, 25 Jun 2024 19:52:21 +0000 (22:52 +0300)]
LU-17974 quota: fix qmt_pool_lqes_lookup_spec

Return 0 from qmt_pool_lqes_lookup_spec if a global lqe
exists among the found lqes, and
return -ENOENT if
* no lqes have been found
* there is no global lqe among the found lqes
This patch aims to prevent the panic below:

 (qmt_lock.c:957:qmt_id_lock_notify())
ASSERTION( lqe->lqe_is_global ) failed:
 (qmt_lock.c:957:qmt_id_lock_notify()) LBUG
 ...
 Call Trace TBD:
 libcfs_call_trace+0x6f/0xa0 [libcfs]
 lbug_with_loc+0x3f/0x70 [libcfs]
 qmt_id_lock_notify+0x1ee/0x330 [lquota]
 qmt_site_recalc_cb+0x34b/0x550 [lquota]
 cfs_hash_for_each_tight+0x122/0x310 [libcfs]
 qmt_pool_recalc+0x375/0xa80 [lquota]
 kthread+0x134/0x150
 ret_from_fork+0x35/0x40
 Kernel panic - not syncing: LBUG
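
The return-value rule stated before the trace can be sketched in
Python as follows. The function name and the representation of an lqe
as a dictionary with an "is_global" key are assumptions for the
example; the real code operates on kernel lqe structures.

```python
import errno


def lookup_result(lqes):
    """Return 0 if a global lqe is among the found lqes, else -ENOENT."""
    if not lqes:
        # No lqes have been found.
        return -errno.ENOENT
    if any(lqe["is_global"] for lqe in lqes):
        return 0
    # Some lqes were found, but none of them is global.
    return -errno.ENOENT
```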

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I62a2175b7b05c49f28b4e87c36ed653d1b9a71cc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17635 lfsck: detect missing LMV hash 64/55364/3
Alexander Zarochentsev [Fri, 7 Jun 2024 17:19:08 +0000 (17:19 +0000)]
LU-17635 lfsck: detect missing LMV hash

Detect striped dirs with a missing LMV hash,
attempting to set it for trivial cases and
marking them BAD_TYPE otherwise.

HPE-bug-id: LUS-12379
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ibce4dd9cf01d653c431f7b7968691a4d704af9d9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55364
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-18010 build: remove dpatch from dependency 11/55711/2
Shuichi Ihara [Thu, 11 Jul 2024 21:53:02 +0000 (06:53 +0900)]
LU-18010 build: remove dpatch from dependency

dpatch is no longer available in Ubuntu 24.04.
Remove it from the dependencies. If it is really needed, use quilt instead.

Test-Parameters: trivial
Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: I94939ec6fe87fdbfe2a5904298d90ec324796921
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55711
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-18018 kernel: update RHEL 9.4 [5.14.0-427.24.1.el9_4] 85/55685/3
Jian Yu [Wed, 10 Jul 2024 06:53:41 +0000 (23:53 -0700)]
LU-18018 kernel: update RHEL 9.4 [5.14.0-427.24.1.el9_4]

Update RHEL 9.4 kernel to 5.14.0-427.24.1.el9_4.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.3 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.3 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 serverdistro=el9.4 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.3 serverdistro=el9.4 testlist=sanity

Test-Parameters: optional clientdistro=el9.4 serverdistro=el9.4 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el9.4 serverdistro=el9.4 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el9.4 serverdistro=el9.4 \
  testgroup=full-part-3

Change-Id: If795f9b12a4c7f7eac14b0d38c8078c0013d64da
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55685
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-930 doc: fix lfs-project.1 quoting 80/55680/2
Andreas Dilger [Tue, 9 Jul 2024 21:48:15 +0000 (15:48 -0600)]
LU-930 doc: fix lfs-project.1 quoting

Fix the quoting for the "-0" description, which otherwise is
formatted badly.  Improve the description to give more direction
as to the intended usage of this option.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id32c542a0697bc0c3c79775051d98d05be4ece5f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55680
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-18011 pcc: fix build failure in ->fileattr_set() 71/55671/4
Qian Yingjin [Tue, 9 Jul 2024 08:31:45 +0000 (04:31 -0400)]
LU-18011 pcc: fix build failure in ->fileattr_set()

The build failed on linux-6.8 kernel:
rc = inode->i_op->fileattr_set(&init_user_ns, dentry, &fa);
       ^~~~~~~~~~~~~
       |
       struct user_namespace *
pcc.c:3265:40: note: expected 'struct mnt_idmap' but argument is
of type 'struct user_namespace'.

Replace "&init_user_ns" with "&nop_mnt_idmap" to fix the build
error.

Fixes: 2d1a906ff11 ("LU-12358 pcc: add project quota support on PCC backend")
Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib79d79fa1aa6e99719d1658cdc4c03e1fa1ea064
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55671
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-17714 gss: support revoked session keyring 27/55627/3
Sebastien Buisson [Thu, 4 Jul 2024 15:09:23 +0000 (17:09 +0200)]
LU-17714 gss: support revoked session keyring

In case the session keyring for a regular user has been revoked, the
key ends up being linked to the user session keyring. So we must
detect this case and properly unlink the key from the correct keyring.
This applies to the initial key creation workflow, as well as to the
explicit context flush ('lfs flushctx').

Add sanity-krb5 test_10 to exercise this capability.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If96703a2de9a4172613bfbd96e7529b16169cf58
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17940 gss: get rid of root key in all cases 55/55555/6
Sebastien Buisson [Thu, 27 Jun 2024 15:20:58 +0000 (17:20 +0200)]
LU-17940 gss: get rid of root key in all cases

The root key associated with a GSS context (gck_key) is used to pass
information between kernel and userspace during GSS context
negotiation.
Whether the GSS context negotiation went well or not, the context and
the key used in this process should be unbound once done. And this
should mean unlinking the key but also directly invalidating it
instead of just revoking it, to make sure the key is ignored by all
searches and other operations.
For the same reasons, invalidate the key when the GSS upcall times
out or the context pre-initialization fails.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8b61d22e942d0dca16b96780889976c3a5f00f6a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55555
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17971 gss: do not make lsvcgss record its PID 09/55509/2
Sebastien Buisson [Mon, 24 Jun 2024 07:32:35 +0000 (09:32 +0200)]
LU-17971 gss: do not make lsvcgss record its PID

The lsvcgssd daemon is expected to spawn a few additional threads at
startup to carry out extra work. In this case finding the PID of the
'main' thread can be complicated.
So do not try to record this by ourselves, and let systemctl handle
that.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7ddfcd5b5f3c69a46079b42d76fb9585953e30b1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55509
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17689 o2iblnd: handle unexpected network data gracefully 01/55501/5
Serguei Smirnov [Fri, 21 Jun 2024 17:40:20 +0000 (10:40 -0700)]
LU-17689 o2iblnd: handle unexpected network data gracefully

Remove assertions in favour of graceful handling of
unexpected data coming in: prefer to report and handle the error
and carry on.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I62dc260e781ab0d2a5069560ca05f692a612bb8f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55501
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17928 lnet: add kmod devel package 87/55387/7
Shaun Tancheff [Wed, 10 Jul 2024 03:20:01 +0000 (10:20 +0700)]
LU-17928 lnet: add kmod devel package

This creates a new kernel module development package for building
kernel modules that depend on the Lustre/LNet kAPI

The most notable of these is DVS which uses LNet

Along with the kernel includes add a package config file: lnet.pc
and the Module.symvers needed for linking against Lustre/LNet kAPI

Use:
   pkg-config --variable=symversdir lnet
to find the path to Module.symvers and include files.

In addition, the dkms build can differ enough from the kmp build that
the packaged Module.symvers and config.h (and possibly the headers as
well) are not interchangeable.

Use the update-alternatives subsystem to enable the dkms and kmp
packages to co-exist and the kmp-devel package to work with either.

Also loosens user space requirement to require:
 Lustre version >= major.minor
and not the exact build

Test-Parameters: trivial
HPE-bug-id: LUS-12246, LUS-12378, LUS-12351
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Signed-off-by: Caleb Carlson <caleb.carlson@hpe.com>
Change-Id: Idb00b881e8f6d4a703cc71fd0d8768e1f433fca3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55387
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17899 gss: improved systemd unit file for SSK daemon 79/55379/4
Chris Hunter [Thu, 6 Jun 2024 05:44:12 +0000 (01:44 -0400)]
LU-17899 gss: improved systemd unit file for SSK daemon

Add operation ordering to lsvcgss initscript/service unit
so it starts after systemd network services are running.

Signed-off-by: Chris Hunter <chunter@ddn.com>
Change-Id: Iad39d01aae16732ff646383814033d6efb34af5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-12597 tests: allow comma-separated node lists 40/54340/5
Andreas Dilger [Tue, 1 Aug 2023 21:18:12 +0000 (15:18 -0600)]
LU-12597 tests: allow comma-separated node lists

Allow some functions that deal with space-separated node lists to
also accept comma-separated node lists, to prepare for a future
where $(osts_nodes) and $(mdts_nodes) will return comma-separated
lists already, instead of having to call comma_list each time.
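
As an illustration only, accepting both separators can be sketched in
Python as below. The real helpers are shell functions in the test
framework; normalize_nodes() and to_comma_list() are hypothetical names
used for this sketch, not functions from the patch.

```python
import re


def normalize_nodes(nodes: str) -> list[str]:
    """Split a node list written with commas, spaces, or a mix of both."""
    # Any run of commas and/or whitespace counts as one separator.
    return [n for n in re.split(r"[,\s]+", nodes.strip()) if n]


def to_comma_list(nodes: str) -> str:
    """Return a sorted, de-duplicated, comma-separated node list."""
    return ",".join(sorted(set(normalize_nodes(nodes))))
```

With this shape a caller does not need to know which form it was given:
"oss1 oss2" and "oss1,oss2" normalize to the same list.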

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I36e5a2a0814fd6564ca560ad93fdaba0423ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54340
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-16959 lnet: auto-tune ARP-related sysctl setting 10/53310/24
Frank Sehr [Fri, 1 Dec 2023 23:00:51 +0000 (15:00 -0800)]
LU-16959 lnet: auto-tune ARP-related sysctl setting

Default Linux settings for net.ipv4.neigh.default.gc_thresh* may be
too low. The configuration file contains recommended threshold values
for the ARP table configuration for larger systems. These values are
not set by default and can be enabled by setting the
enable_sysctl_setup parameter to 1 in the configuration file.
To activate the changes immediately, execute
sysctl -p /etc/lnet-sysctl.conf as root.
New ticket for documentation:
LUDOC-528 - Adding documentation for enable_sysctl_setup

Test-Parameters: trivial testlist=sanity-lnet env=ONLY=260
Signed-off-by: Frank Sehr <fsehr@whamcloud.com>
Change-Id: I34af4b402b59341ee7e9cfb45fef7c67eb5e78e9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53310
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-17131 ldiskfs: Refresh suse15 sp3 series 44/52944/6
Shaun Tancheff [Mon, 26 Feb 2024 15:24:01 +0000 (22:24 +0700)]
LU-17131 ldiskfs: Refresh suse15 sp3 series

Add:
  ext4-filename-encode.patch
  ext4-add-periodic-superblock-update.patch

Update:
  ext4-encdata.patch

Test-Parameters: trivial
HPE-bug-id: LUS-11967
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Idb942ecaf7bac4e335f448885cf3836bc900f416
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52944
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17583 llite: getattr/open should not revalidate dentry 54/54354/5
Etienne AUJAMES [Mon, 11 Mar 2024 17:51:57 +0000 (18:51 +0100)]
LU-17583 llite: getattr/open should not revalidate dentry

ll_getattr() and ll_intent_file_open() do not perform a lookup; they
get the attributes and ldlm locks by FID (inode). So they should not
revalidate the dentry, otherwise they may produce dir cache
inconsistencies (e.g. with a cwd fd).

Add a regression test: sanityn 31s, 31t

Fixes: 14ca315 ("LU-10948 llite: Revalidate dentries in ll_intent_file_open")
Fixes: 92fadf9 ("LU-15200 llite: revalidate dentry if LOOKUP lock fetched")
Test-Parameters: testlist=sanityn env=ONLY=31s,ONLY_REPEAT=20
Test-Parameters: testlist=sanityn env=ONLY=31t,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ic9823cddf37373dc95f4de3219c88c0fa0600fa7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54354
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
4 months agoLU-18009 obd: remove o_fid_init/o_fid_fini 67/55667/2
Timothy Day [Mon, 8 Jul 2024 21:22:55 +0000 (21:22 +0000)]
LU-18009 obd: remove o_fid_init/o_fid_fini

In every case, o_fid_init is client_fid_init and o_fid_fini is
client_fid_fini. Remove these function pointers.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Idec4e5d7948b12d67f919f58b97a7119775aaf4e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55667
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17000 lnet: Correctly handle args passed to jt_show_fault() 51/55651/4
Arshad Hussain [Sun, 7 Jul 2024 01:17:26 +0000 (21:17 -0400)]
LU-17000 lnet: Correctly handle args passed to jt_show_fault()

Remove 'return 0' from the default case of jt_show_fault() argument
processing and set rc to -EINVAL instead. This correctly
handles bad arguments, e.g. 'lnetctl fault show -x delay'
or 'lnetctl fault show -t'. The 'rc' check deemed unnecessary
by Coverity now becomes legitimate.
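
The behaviour can be modelled with a short Python sketch. This is not
the lnetctl code: parse_show_args() is a hypothetical name, and the
option string "t:" (a -t option that takes an argument) is an
assumption inferred from the two examples above.

```python
import errno
import getopt


def parse_show_args(argv):
    """Return (rc, rule_type); rc is 0 or a negative errno value."""
    rc = 0
    rule_type = None
    try:
        opts, _ = getopt.getopt(argv, "t:")
    except getopt.GetoptError:
        # Unknown option or missing argument. This plays the role of the
        # C 'default:' branch, which records an error rather than
        # returning 0 straight away.
        rc = -errno.EINVAL
        opts = []
    for opt, val in opts:
        if opt == "-t":
            rule_type = val
    # Because rc can now be non-zero, a later 'if rc' check is live code.
    return rc, rule_type
```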

Test-Parameters: trivial testlist=sanity-lnet
CoverityID: 429592 ("Logically dead code")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id1dc52218405dbd094a7e8304aafeff57b46ab79
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55651
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-18004 ptlrpc: shrink timeout using MIN and not MAX value 39/55639/3
Aurelien Degremont [Fri, 5 Jul 2024 13:04:19 +0000 (15:04 +0200)]
LU-18004 ptlrpc: shrink timeout using MIN and not MAX value

Change import_select_connection() to correctly use
CONNECTION_SWITCH_MIN.

When trying to set a small timeout for quick connection tests
patch v2_15_61-238-g94d05d0737 wrongly used CONNECTION_SWITCH_MAX
instead of CONNECTION_SWITCH_MIN.

Fixes: 94d05d0737 ("LU-17379 mgc: try MGS nodes faster")
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: Ia85eac787441d7bef6fd47b083060bf14a8f9a31
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55639
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-16350 ldiskfs: removed unused ldiskfs patch 37/55637/3
Shaun Tancheff [Sat, 6 Jul 2024 01:55:16 +0000 (08:55 +0700)]
LU-16350 ldiskfs: removed unused ldiskfs patch

The ldiskfs patch:
   linux-5.18/ext4-prealloc.patch
is not used; remove it.

Test-Parameters: trivial
HPE-bug-id: LUS-11376
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I628c05681366c937a2a60f1b731c4c628720a8f9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55637
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-18000 kernel: update SLES15 SP5 [5.14.21-150500.55.68.1] 21/55621/2
Jian Yu [Thu, 4 Jul 2024 00:21:59 +0000 (17:21 -0700)]
LU-18000 kernel: update SLES15 SP5 [5.14.21-150500.55.68.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.68.1 for Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp5 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3

Change-Id: Id88738be17f8fabe845f943c88d6428faecc63be
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55621
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17998 kernel: update RHEL 8.10 [4.18.0-553.8.1.el8_10] 19/55619/2
Jian Yu [Thu, 4 Jul 2024 00:01:02 +0000 (17:01 -0700)]
LU-17998 kernel: update RHEL 8.10 [4.18.0-553.8.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.8.1.el8_10.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.9 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.9 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-1

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-2

Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
  testgroup=full-part-3

Change-Id: I578a3ecae6539d674b7078f08227a56a729a6e22
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55619
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17990 tests: sanity 33hh MDT index match often 11/55611/3
Frederick Dilger [Wed, 3 Jul 2024 03:52:45 +0000 (21:52 -0600)]
LU-17990 tests: sanity 33hh MDT index match often

test_33hh in sanity.sh likely failed due to random chance, as
occasionally the generated names contain only numbers
or only letters.

To reduce the chance of this being an issue, if the test fails it
will re-run up to 3 times internally, after which if there is still
an issue something is surely wrong and it will fail.
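
A minimal sketch of the retry idea, in Python for illustration only
(the test itself is a shell function). run_with_retries() is a
hypothetical helper, and it assumes "up to 3 times" means three
attempts in total.

```python
def run_with_retries(attempt, max_tries=3):
    """Call attempt() until it returns True, at most max_tries times.

    Returns the number of tries used on success and raises RuntimeError
    when every try fails.
    """
    for tries in range(1, max_tries + 1):
        if attempt():
            return tries
    raise RuntimeError(f"still failing after {max_tries} tries")
```

If a single attempt fails by chance with probability p and attempts
are independent, three attempts all fail with probability p**3, so
occasional unlucky name generation stops being reported as a test
failure.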

Test-Parameters: trivial testlist=sanity env=ONLY=33hh,ONLY_REPEAT=100
Test-Parameters: trivial testlist=sanity env=ONLY=33hh,ONLY_REPEAT=100

Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I4385bd2621f1305e9c11b27f9eb67f9a45aa606a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55611
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 months agoLU-17996 mgs: add ability to clear exports 96/55596/2
Sebastien Buisson [Tue, 2 Jul 2024 08:04:56 +0000 (10:04 +0200)]
LU-17996 mgs: add ability to clear exports

Just like with other targets (MDT, OST), give the ability to clear
dead exports from the exports list 'mgs.MGS.exports'.
Improve sanity-sec test_31 to benefit from this new ability.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4e99de31834753d223fd3cfe226f6f0343f2586b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55596
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17848 dt: allow dio_lookup/insert/delete to be optional 33/55633/2
Timothy Day [Thu, 4 Jul 2024 18:32:06 +0000 (18:32 +0000)]
LU-17848 dt: allow dio_lookup/insert/delete to be optional

Not every user of the dio API requires these operations. Return
EOPNOTSUPP rather than LASSERT() if they are not implemented.

Clean up some stub functions in osp and lfsck.
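
The pattern can be sketched as follows. This is a Python model for
illustration, not the dt code: IndexOps and index_lookup() are
hypothetical names standing in for an operation table and its wrapper.

```python
import errno


class IndexOps:
    """Operation table; an entry left as None means 'not implemented'."""

    def __init__(self, lookup=None, insert=None, delete=None):
        self.lookup = lookup
        self.insert = insert
        self.delete = delete


def index_lookup(ops, key):
    """Wrapper that callers use instead of touching the table directly."""
    if ops.lookup is None:
        # Previously this situation would have tripped an assertion;
        # now the caller just gets an error code.
        return -errno.EOPNOTSUPP
    return ops.lookup(key)
```

An implementer that has nothing useful to do for an operation simply
leaves the entry unset instead of carrying a stub.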

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I2df23b87cfca5844f8c5ca843251c463909fcd47
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55633
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17848 dt: allow declare functions to be optional 32/55632/2
Timothy Day [Thu, 4 Jul 2024 17:59:19 +0000 (17:59 +0000)]
LU-17848 dt: allow declare functions to be optional

If an OSD (or other dt implementer) doesn't have anything to
declare, don't force it to implement a declare function for
an operation.

Clean up some examples of useless declare functions in osp
and lfsck.

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I0d7e9f491ff2a8f6e4f3bf315a10437cd42c2351
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55632
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17848 osd-zfs: use LU_TYPE_INIT_FINI() macro 09/55609/3
Timothy Day [Tue, 2 Jul 2024 23:27:02 +0000 (23:27 +0000)]
LU-17848 osd-zfs: use LU_TYPE_INIT_FINI() macro

Use LU_TYPE_INIT_FINI() macro rather than implementing the
required functions manually.

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I4117d1174bc7d07b184eb16d826452b075b04ea3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55609
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 months agoLU-17848 osd-zfs: remove osd_ladvise()/falloc() 08/55608/3
Timothy Day [Tue, 2 Jul 2024 23:14:00 +0000 (23:14 +0000)]
LU-17848 osd-zfs: remove osd_ladvise()/falloc()

These are implemented as stub functions that return EOPNOTSUPP.
Remove the functions and add a check in the corresponding dt
functions instead.

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6fad0a9ca8b07e3d09701e71773dc896a3845b9e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55608
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17848 osd: remove osd_invalidate() for ldiskfs/ZFS 07/55607/3
Timothy Day [Tue, 2 Jul 2024 23:06:12 +0000 (23:06 +0000)]
LU-17848 osd: remove osd_invalidate() for ldiskfs/ZFS

This is implemented as a stub function that returns 0.
Remove the implementations from the OSD and add a check into
dt_invalidate().

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ieee086218dc83c3129bc572689a14c79c981bcb7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17848 osd: remove osd_check_stale() for ldiskfs/ZFS 06/55606/3
Timothy Day [Tue, 2 Jul 2024 22:54:20 +0000 (22:54 +0000)]
LU-17848 osd: remove osd_check_stale() for ldiskfs/ZFS

This is implemented as a stub function that returns false.
Remove the implementations from the OSD and add a check into
dt_check_stale().

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Id7fb2c1d1600a3dcc5d278cb2dab5d65a10bdefd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55606
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17165 tests: fix recovery-small test_141 91/55591/2
Sebastien Buisson [Mon, 1 Jul 2024 09:01:54 +0000 (11:01 +0200)]
LU-17165 tests: fix recovery-small test_141

Wait for import generation change before counting the locks when the
MGS has been restarted. And to make things clearer, check lock count
on OST side.

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 testlist=recovery-small env=ONLY=141,ONLY_REPEAT=20
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie9526aa38e3a669b7865516a296dfeed438a83f3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55591
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17985 osd-ldiskfs: drop osd object if failed to create 71/55571/4
Hongchao Zhang [Fri, 21 Jun 2024 21:51:31 +0000 (05:51 +0800)]
LU-17985 osd-ldiskfs: drop osd object if failed to create

In osd_create(), if the newly created inode already contains a
correct XATTR_NAME_LMA but updating the OI fails, osd_object->oo_inode
is cleared; the osd_object should also be dropped.

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I4ff5952c154ce459c78514b88b1810471635c703
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55571
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
4 months agoLU-14094 tests: improve sanity.sh test_311 66/55566/6
Emoly Liu [Fri, 28 Jun 2024 17:12:29 +0000 (01:12 +0800)]
LU-14094 tests: improve sanity.sh test_311

Improve sanity.sh test_311 to see why the number of the objects
doesn't decrease as expected.

Test-Parameters: trivial testlist=sanity env=ONLY=311,ONLY_REPEAT=200
Test-Parameters: trivial testlist=sanity env=ONLY=311,ONLY_REPEAT=200
Test-Parameters: trivial testlist=sanity env=ONLY=311,ONLY_REPEAT=200
Test-Parameters: trivial testlist=sanity env=ONLY=311,ONLY_REPEAT=200
Test-Parameters: trivial testlist=sanity env=ONLY=311,ONLY_REPEAT=200
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Iabbaed42c5654ef31bc9f98fe9868785f8ff2f18
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55566
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17984 lnet: Remove the correct state on failure 52/55552/2
Shaun Tancheff [Thu, 27 Jun 2024 10:49:22 +0000 (17:49 +0700)]
LU-17984 lnet: Remove the correct state on failure

On cpu init, a failure to set up CPUHP_AP_ONLINE_DYN should
remove the previously set up state CPUHP_BP_PREPARE_DYN.

In that error path, CPUHP_AP_ONLINE_DYN should be CPUHP_BP_PREPARE_DYN.

Test-Parameters: trivial
Fixes: 6d27c2c8c72 ("LU-17592 build: compatibility updates for kernel 6.8")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic9fc9dd4e798be3a0db65092e2b8e545ec5d4687
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55552
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Caleb Carlson <caleb.carlson@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-9119 lnet: remove "struct' from generated comment 07/55507/2
Olaf Weber [Fri, 27 Jan 2017 15:17:01 +0000 (16:17 +0100)]
LU-9119 lnet: remove "struct' from generated comment

The CHECK_STRUCT() generates a comment saying
"Checks for struct " followed by the type name.
If the type name is 'struct mumble' the result
is "Checks for struct struct mumble". Drop the
extra "struct".
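
The fix can be illustrated with a small Python model.
check_struct_comment() is a hypothetical name, and wrapping the text in
/* */ is an assumption about how the generated comment looks.

```python
def check_struct_comment(type_name: str) -> str:
    """Build the 'Checks for struct ...' comment without doubling 'struct'."""
    prefix = "struct "
    # Drop a leading 'struct ' so the fixed wording supplies it only once.
    bare = type_name[len(prefix):] if type_name.startswith(prefix) else type_name
    return f"/* Checks for struct {bare} */"
```

Both check_struct_comment("mumble") and
check_struct_comment("struct mumble") produce the same comment.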

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Olaf Weber <olaf.weber@hpe.com>
Change-Id: I90b13a2c500c63accb90ef567b197defd5521dea
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55507
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-930 doc: document no_create mount option 03/55503/3
Andreas Dilger [Fri, 21 Jun 2024 22:15:33 +0000 (16:15 -0600)]
LU-930 doc: document no_create mount option

Add the "-o no_create" mount option to the mount.lustre.8 man page.

Test-Parameters: trivial
Fixes: 1dbcd0bab8 ("LU-12998 mds: add no_create parameter to stop creates")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I143f46f71fdcff8ce320861e7ade0f7a9a1f96f7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55503
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Maximilian Dilger <mdilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17460 lnet: properly handle net_device referencing 82/55582/5
James Simmons [Mon, 8 Jul 2024 20:20:52 +0000 (16:20 -0400)]
LU-17460 lnet: properly handle net_device referencing

Most of the time LNet uses __in[6]_dev_get_xxx(), which does not
change any reference count. The one exception is the use of
dev_get_by_index() called in lnet_create_socket(). Replace
dev_get_by_index() with dev_get_by_index_rcu(). Also examined the code
to make sure the right type of locking was done. If we use RCU locking
we should use for_each_netdev_rcu(), so update ksocknal_ip2index().

Test-Parameters: trivial testlist=sanity-lnet
Fixes: e4fa181abf1 ("LU-10391 lnet: allow creation of IPv6 socket.")
Fixes: 09c6e2b8722 ("LU-16836 lnet: ensure dev notification on lnd startup")
Change-Id: I0c496652553318bd0e47fa1e03d6e631fd8421bb
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55582
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 months agoLU-15053 tests: sanity-quota_13 fix 97/55197/7
Sergey Cheremencev [Fri, 24 May 2024 16:28:47 +0000 (19:28 +0300)]
LU-15053 tests: sanity-quota_13 fix

Handle the case where there are extra users
with a quota limit and non-zero usage besides
TSTUSR and TSTUSR2. This is possible when
tests are started with ENABLE_QUOTA=yes.
In such a case each user may have a lock between
OST and QMT. Take this into account
in sanity-quota_13. This fixes the following failure:

  sanity-quota test_13: @@@@@@ FAIL: 2 cached locks

Test-Parameters: trivial testlist=sanity-quota
Test-Parameters: testlist=sanity-quota env=ONLY=13,ONLY_REPEAT=100
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Iaf48d0eb80eef0fc5ebc8246e8fac3f9c96563c0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55197
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17629 utils: support hostname with 94/54894/9
James Simmons [Wed, 26 Jun 2024 17:43:35 +0000 (13:43 -0400)]
LU-17629 utils: support hostname with
 lustre_lnet_parse_nid_range()

A hostname may map to multiple IPs. In
this case lnetctl commands that attempt to use the hostname
can resolve to the wrong IP address. Update the function
lustre_lnet_parse_nid_range() to work with hostnames and
properly resolve the correct IP address. Update both
lnetctl ping and lnetctl discover to work with
lnet_parse_nid_range().

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I670799edcb04a02380e96c289ba26854b057d978
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54894
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-17722 tests: trim tmpfs from wait_delete_completed() 20/54720/10
Alex Zhuravlev [Wed, 10 Apr 2024 12:27:22 +0000 (15:27 +0300)]
LU-17722 tests: trim tmpfs from wait_delete_completed()

to release unused ram

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Idcd4d15e0f56184e1d1897f3a64d5b62baaf7edb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54720
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 months agoLU-10499 pcc: add stats for attach|detach|auto_attach 18/54418/7
Qian Yingjin [Fri, 27 Aug 2021 03:46:13 +0000 (11:46 +0800)]
LU-10499 pcc: add stats for attach|detach|auto_attach

In this patch, we add stats for PCC attach, detach and
auto_attach.
With this feature, we verify that PCC can auto-attach the file
into PCC cache without having to re-fetch the data of the whole
file.
Add sanity-pcc test_44.

EX-bug-id: EX-3715
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia0c1cd6b414998e72859aaf34c125b5a4e4e743c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-10499 pcc: avoid to specify ID for every attach 14/54414/6
Qian Yingjin [Tue, 8 Jun 2021 09:54:49 +0000 (17:54 +0800)]
LU-10499 pcc: avoid to specify ID for every attach

In this patch, it avoids the need to specify "-i <attach_id>" for
every attach as in the very common case there is only a single
cache for that client.
If attach ID is not specified, it will select the first dataset
on the client as PCC backend.
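
The selection rule can be sketched in Python as below. This is an
illustrative model only: select_dataset() and the (id, path) tuple
layout are hypothetical, not the llite data structures.

```python
def select_dataset(datasets, attach_id=None):
    """Pick a PCC backend from datasets, a list of (id, path) tuples.

    With no attach_id the first configured dataset is used; with an
    attach_id only an exact match is returned. None means no backend.
    """
    if not datasets:
        return None
    if attach_id is None:
        return datasets[0]
    for dataset in datasets:
        if dataset[0] == attach_id:
            return dataset
    return None
```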

And the new format of PCC state is as follows:
file: /mnt/lustre/f42.sanity-pcc, type: readonly, PCC_file:
/d42.sanity-pcc/0402/0x200000401:0x3:0x0, open_count: 0, flags: 0

EX-3752 pcc: show attaching state for PCC state output

When llite.*.pcc_async_threshold=0 is set, the client performs PCC
attach asynchronously.
When the file is large, attaching the file into PCC may take some
time.
This patch improves the output of the PCC command
"lfs pcc state" to show that the file is in PCC attaching state
while it is still being copied from Lustre OSTs
to PCC.
Was-Change-Id: I101d87638f5afac41fb4f55b4aaf95d938bc8ccd

EX-bug-id: EX-3292 EX-3752
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Icd23eda5dca4711f9bb7af940f6cef5ddb97ce69
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54414
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 months ago LU-10499 pcc: avoid dead lock for auto attach in PCC-RO 90/54390/9
Qian Yingjin [Wed, 12 May 2021 03:43:28 +0000 (11:43 +0800)]
LU-10499 pcc: avoid dead lock for auto attach in PCC-RO

This patch releases the pcc inode lock when calling
ll_layout_refresh() in @pcc_try_auto_attach(), as holding it may
cause the following deadlock:
1. The client is writing or truncating a file in readonly mode.
   At this time, it sends a write layout intent lock to clear
   the readonly state of the layout on the MDT.
2. A read process tries to auto attach the file with the pcc inode
   lock held. During the progress of auto attach, it calls
   ll_layout_refresh(). If the client-side enqueue request for a
   layout lock returns a blocked lock, the process sleeps and
   waits for the lock to be granted.
3. The MDT takes an EX layout lock to cancel all cached layout
   locks on clients, in order to change the layout and clear the
   PCC-RO state.
4. When the client handles the revocation of the layout lock, it
   needs to invalidate the PCC state, which must be done under the
   protection of the pcc inode lock, still held by the read
   process in step 2.
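
The lock-dropping pattern described above can be sketched in user
space with a pthread mutex. This is an analogue only, not the kernel
code: the struct, the function names and the callback standing in for
ll_layout_refresh() are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of the per-inode PCC state. */
struct pcc_inode_sketch {
	pthread_mutex_t lock;	/* models the pcc inode lock */
	bool attached;		/* models "file is attached to PCC" */
};

/* Stand-in for a blocking layout refresh; returns 0 on success. */
typedef int (*refresh_fn)(struct pcc_inode_sketch *pi);

/*
 * Demo refresh: succeeds only if the lock is free. This models a
 * revocation handler that must take the same lock before the refresh
 * can complete.
 */
static int refresh_needs_lock_free(struct pcc_inode_sketch *pi)
{
	if (pthread_mutex_trylock(&pi->lock) != 0)
		return -1;
	pthread_mutex_unlock(&pi->lock);
	return 0;
}

static int try_auto_attach_sketch(struct pcc_inode_sketch *pi,
				  refresh_fn refresh)
{
	int rc;

	pthread_mutex_lock(&pi->lock);
	if (pi->attached) {
		pthread_mutex_unlock(&pi->lock);
		return 0;
	}
	/* Drop the lock before the call that may block. */
	pthread_mutex_unlock(&pi->lock);

	rc = refresh(pi);

	/* Retake the lock and recheck state that may have changed. */
	pthread_mutex_lock(&pi->lock);
	if (rc == 0 && !pi->attached)
		pi->attached = true;
	pthread_mutex_unlock(&pi->lock);
	return rc;
}
```

Calling try_auto_attach_sketch(&pi, refresh_needs_lock_free) succeeds
here because the lock is not held across the refresh; if the lock were
kept, the demo refresh would fail instead of completing.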

EX-3191 pcc: add test for mmap | write | detach racer

This patch adds the mmap racer among: (write | read | mmap_cat |
detach | unlink): sanity-pcc/test_99.
Was-Change-Id: I5db160851a95937275fea6ae32f40dcd0fe69f46

EX-3478 pcc: avoid uninitialized pcc mutex lock in cleanup

Running racer concurrently crashed in the following way:
  RIP: 0010:[...]  [...] __list_add+0x1b/0xc0
  __mutex_lock_slowpath+0xa6/0x1d0
  mutex_lock+0x1f/0x2f
  pcc_inode_free+0x1e/0x60 [lustre]
  ll_clear_inode+0x64/0x6a0 [lustre]
  ll_delete_inode+0x5d/0x220 [lustre]
  evict+0xb4/0x180
  iput+0xfc/0x190
  ll_iget+0x156/0x350 [lustre]
  ll_prep_inode+0x212/0x9b0 [lustre]

After analysis, we found that the mutex @lli_pcc_lock is not
initialized, because ll_lli_init() is not called to
initialize @lli.
When pcc_inode_free() is called, it calls mutex_lock() on the
uninitialized @lli_pcc_lock and thus crashes the kernel.

In liblustreapi_pcc.c, it should set errno on error return.
Was-Change-Id: I612c79a5b8eb4fa9daeb9e446a457e95c666c04a

EX-3636 pcc: reset file mmaping for the file once mmaped

For a file that was once mmapped and cached on PCC, a new open sets
the mapping for the file handle of the PCC copy (@file->f_mapping)
to that of the Lustre file handle. When the file is detached from
PCC due to manual detach or layout lock shrinking, normal I/O
(read/write) will auto-attach the file into PCC again during I/O,
as the layout version is unchanged. However, it still needs to
reset the file mapping (@pcc_file->f_mapping) to the mapping of
the PCC copy. Otherwise it will cause a panic as follows:
[  935.516823] RIP: 0010:_raw_read_lock+0xa/0x20
[  935.517077]  ll_cl_find+0x19/0x60 [lustre]
[  935.517098]  ll_readpage+0x51/0x820 [lustre]
[  935.517110]  read_pages+0x122/0x190
[  935.517119]  __do_page_cache_readahead+0x1c1/0x1e0
[  935.517131]  ondemand_readahead+0x1f9/0x2c0
[  935.517142]  pagecache_get_page+0x30/0x2c0
[  935.517165]  generic_file_buffered_read+0x556/0xa00
[  935.517189]  pcc_try_auto_attach+0x3ac/0x400 [lustre]
[  935.517552]  pcc_io_init+0x146/0x560 [lustre]
[  935.517906]  pcc_file_read_iter+0x24d/0x2b0 [lustre]
[  935.518259]  ll_file_read_iter+0x74/0x2e0 [lustre]
[  935.518604]  new_sync_read+0x121/0x170
[  935.518937]  vfs_read+0x8a/0x140

This patch adds sanity-pcc test_98 to verify it.

I/O for a file that was opened before attach into PCC, or
opened while in ATTACHING state, will fall back to Lustre OSTs.
For a later mmap() on the file, the mmap() I/O also needs to
fall back to Lustre OSTs and cannot read directly from the local
valid cached PCC copy until all fallback file handles are closed,
as the mapping of the PCC copy is replaced with that of the Lustre
file when a file is mmapped.
Add sanity-pcc test_97 to verify it.

We also forbid auto attach of a file that is still under
mmapped I/O.

EX-3636 pcc: auto attach should skip if already attached

When trying to auto attach a file into PCC, if the file is found
to be already attached into PCC, the auto attach processing
should be skipped. Otherwise, it will result in a wrong PCC inode
refcount when multiple threads try to auto attach a file at the
same time.
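
A minimal user-space sketch of the "skip if already attached" check
follows. The struct and function names are hypothetical and a pthread
mutex stands in for the kernel lock; it only illustrates why the check
keeps the refcount correct when several callers race.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-inode PCC state; not the real Lustre structure. */
struct pcc_state_sketch {
	pthread_mutex_t lock;
	bool attached;
	int refcount;
};

/*
 * Returns true if this call performed the attach, false if it was
 * skipped because the file was already attached. The reference is
 * taken only on the transition, so repeated or concurrent callers
 * cannot inflate the refcount.
 */
static bool auto_attach_once(struct pcc_state_sketch *st)
{
	bool did_attach = false;

	pthread_mutex_lock(&st->lock);
	if (!st->attached) {
		st->attached = true;
		st->refcount++;
		did_attach = true;
	}
	pthread_mutex_unlock(&st->lock);
	return did_attach;
}
```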

For a file once mmapped into PCC and detached due to layout lock
shrinking or a manual detach command, if the file is found to be
still validly cached (attached into PCC again by another thread)
in @pcc_mmap_io_init(), the mapping of the PCC copy should be set
to that of the Lustre file again.
Was-Change-Id: I5f049ca7d6db8708712e79e9ad459fc60b80f2be

LU-17964 pcc: set mapneg bit in all cases of normal I/O fallback

While a file's data is being copied from Lustre OSTs to the PCC
copy, the file is in PCC ATTACHING state. New opens and I/O on
this file will fall back to the normal I/O path (Lustre OSTs)
before the attach is finished, and the file handle will have the
fallback and mapneg bits set. Currently we only clear the fallback
and mapneg bits when the file handle is closed.

To support mmap() I/O, we replace the mapping of the PCC copy with
that of the Lustre file. However, we can do that only if the
Lustre file has no open file handle with the mapneg bit set.
Otherwise, we cannot switch the mapping, and the mmap() I/O will
also fall back to Lustre OSTs and use the mapping of the Lustre
file.
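
The condition for switching the mapping can be illustrated with a
small sketch. The flag values, the struct and the function name are
assumptions made for the example; the real Lustre definitions may
differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for a file handle. */
#define FH_FALLBACK	0x1u	/* I/O goes to Lustre OSTs */
#define FH_MAPNEG	0x2u	/* handle must keep the Lustre mapping */

/* Hypothetical per-open-handle record. */
struct fh_sketch {
	unsigned int flags;
	bool open;
};

/*
 * The mapping may be switched only if no open handle of the file
 * still has the mapneg bit set.
 */
static bool can_switch_mapping(const struct fh_sketch *fhs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (fhs[i].open && (fhs[i].flags & FH_MAPNEG))
			return false;
	}
	return true;
}
```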

Once a mmap()ed file has been detached from the PCC backend due to
a manual detach command or the revocation of the LAYOUT ibit lock
(which protects the cache validity of PCC cache access), we should
reset the mapping of the PCC file accordingly and set the fallback
and mapneg bits if the I/O is falling back to the normal path
(Lustre OSTs).
Was-Change-Id: Ibd152aaf724dcff48efbe022dc7f3e70848b4e0d

EX-bug-id: EX-3080 EX-3191 EX-3478 EX-3480 EX-3636
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I18890d19d03726a5991c923505e8c5363382fdc2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54390
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-16350 ldiskfs: Server support for linux v6.7 / Ubuntu 24.04 16/54216/10
Shaun Tancheff [Sat, 6 Jul 2024 01:54:39 +0000 (08:54 +0700)]
LU-16350 ldiskfs: Server support for linux v6.7 / Ubuntu 24.04

Exclude kunit tests [files matching *-test.c] from ldiskfs build

Updated patch series for Linux v6.7:
  ext4-corrupted-inode-block-bitmaps-handling-patches.patch
  ext4-ialloc-uid-gid-and-pass-owner-down.patch

Updated patch series for Linux v6.5:
  ext4-data-in-dirent.patch

Change struct osd_it_ea_dirent.oied_name from a zero-length array
to a flexible array member so strncmp works as expected.
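
For illustration, a trailing name field declared as a C99 flexible
array member looks like the sketch below. The struct and helper names
are hypothetical, not the osd-ldiskfs definitions. The commit message
does not say why strncmp misbehaved; one plausible reason (an
assumption here) is that compiler object-size checks treat a
zero-length trailing array as having no storage.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical directory-entry record with a trailing name. */
struct dirent_sketch {
	unsigned short namelen;
	char name[];	/* flexible array member, not name[0] */
};

/* Allocate an entry with room for the name and its terminating NUL. */
static struct dirent_sketch *dirent_sketch_new(const char *name)
{
	size_t len = strlen(name);
	struct dirent_sketch *ent;

	if (len > 0xffff)
		return NULL;
	ent = malloc(offsetof(struct dirent_sketch, name) + len + 1);
	if (ent == NULL)
		return NULL;
	ent->namelen = (unsigned short)len;
	memcpy(ent->name, name, len + 1);
	return ent;
}

/* Compare the stored name against a candidate using strncmp. */
static int dirent_sketch_name_is(const struct dirent_sketch *ent,
				 const char *name)
{
	size_t len = strlen(name);

	return ent->namelen == len && strncmp(ent->name, name, len) == 0;
}
```

The caller owns the returned entry and releases it with free().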

Test-Parameters: trivial
HPE-bug-id: LUS-11376
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I2b2325a5874a91096fbd63750096e459065668bc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54216
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>