Whamcloud - gitweb
fs/lustre-release.git
14 months ago LU-13569 lnet: Only recover known good peer NIs
Chris Horn [Thu, 16 Jul 2020 03:38:52 +0000 (22:38 -0500)]
LU-13569 lnet: Only recover known good peer NIs

A peer NI should not be eligible for recovery if we've never
received a message from it.

Lustre-change: https://review.whamcloud.com/39719
Lustre-commit: 39a169cd02738a13866f3b88fbe3304dc20565d6

Test-Parameters: trivial
HPE-bug-id: LUS-9109
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Iec2fd015f6410ab91c6ef7c222cbed0204243106
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-13569 lnet: Age peer NI out of recovery
Chris Horn [Sun, 23 Aug 2020 15:14:22 +0000 (10:14 -0500)]
LU-13569 lnet: Age peer NI out of recovery

No longer send recovery pings to a peer NI that has been in recovery
for the recovery time limit. A peer NI will become eligible for
recovery again once we receive a message from it.

The existing lpni_last_alive field is utilized for this new purpose.
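
Taken together with the "Only recover known good peer NIs" change, the eligibility rule can be sketched roughly as follows. This is an illustration in Python, not the LNet code: the PeerNI class, the eligible_for_recovery() helper and the recovery_limit parameter are hypothetical names, and treating last_alive == 0 as "never received a message" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PeerNI:
    # Hypothetical stand-in for a peer NI. last_alive plays the role of
    # lpni_last_alive: the time a message was last received from the
    # peer NI, with 0 assumed to mean "never".
    last_alive: float = 0.0


def eligible_for_recovery(lpni, now, recovery_limit):
    """Return True if recovery pings should still be sent to this peer NI."""
    if lpni.last_alive == 0:
        # Never heard from this peer NI: not a known good NI.
        return False
    # Aged out once it has gone recovery_limit seconds without a message.
    return (now - lpni.last_alive) < recovery_limit


def on_message_received(lpni, now):
    """Receiving a message makes the peer NI eligible for recovery again."""
    lpni.last_alive = now
```

With this rule, a peer NI that aged out becomes eligible again as soon as on_message_received() refreshes last_alive.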

A check for NULL lpni is removed from
lnet_handle_remote_failure_locked() because all callers of that
function already ensure the lpni is non-NULL.

lnet_peer_ni_add_to_recoveryq_locked() now takes the recovery queue
as an argument rather than using the_lnet.ln_mt_peerNIRecovq. This
allows the function to be used by lnet_recover_peer_nis().
lnet_peer_ni_add_to_recoveryq_locked() is also modified to take a ref
on the peer NI if it is added to the recovery queue. Previously, it
was the responsibility of callers to take this ref.

Lustre-change: https://review.whamcloud.com/39718
Lustre-commit: cc27201a76574b51dc3ffb37f039b3364cab386d

Test-Parameters: trivial
HPE-bug-id: LUS-9109
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ib4676540ac4bb040690a4fb047236c54eea0e752
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54400
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-10465 test: fix interop test for 4M default stripe size
Andreas Dilger [Thu, 23 Jan 2020 20:15:10 +0000 (20:15 +0000)]
LU-10465 test: fix interop test for 4M default stripe size

New servers may use a 4MiB default stripe size, so some of
the tests need to use larger component extents or specify the stripe
size explicitly to accommodate a sufficient stripe count.

Patch includes several test fixes:
- sanity-pfl: take the stripe size into account in some tests
- sanity-flr: use a bigger component size and amount of data to
  saturate all stripes as expected by the test
- sanity: 130g uses a 1M stripe size prior to the FIEMAP calculations
- sanity-lfsck: 36[a-c] use a 1M stripe size as expected by the
  calculations

This change for test scripts comes from:

Lustre-change: https://review.whamcloud.com/37318
Lustre-commit: ea18d7da59d369f093e340e150544f51b2f229a1

Test-Parameters: testlist=sanity-flr serverversion=2.15 env=ONLY="0 208"
Test-Parameters: testlist=sanity-pfl serverversion=2.15 env=ONLY="0 1 14 19 20 24"
Test-Parameters: testlist=sanity serverversion=2.15 env=ONLY="27 130"
Fixes: ee7dfc5ad1 ("LU-17025 llapi: Verify stripe pool name")
Fixes: 0396310692 ("LU-15727 lod: honor append_pool with default composite layouts")
Fixes: b384ea39e5 ("LU-14480 pool: wrong usage with ost list")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3cef8805247fc5253e0a0ac05157b9d609054df9
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months ago EX-9372 lpurge: rid of dup layout get
Bobi Jam [Fri, 22 Mar 2024 07:49:21 +0000 (15:49 +0800)]
EX-9372 lpurge: rid of dup layout get

Delete the unnecessary repeated layout fetching code in
lpurge_mirror_delete(), and update the fd/fdv open flags to align with
LU-14677, which allows layout operations on encrypted files even when
the encryption key is not available.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: If4e6e3a9c46ab782f08e5bc5b3710339af051719
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Alexandre Ioffe <aioffe@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-14535 quota: improve quota output format
Andreas Dilger [Sat, 23 Mar 2024 01:28:42 +0000 (18:28 -0700)]
LU-14535 quota: improve quota output format

Make the output of the quota proc files more readable by removing
needless whitespace that causes it to wrap beyond a single line.

Lustre-change: https://review.whamcloud.com/43099
Lustre-commit: cd1847e73e5990ef797280846dc27fc0b0a876e9

[NOTE: missing changes to procfs output that change field names]

Test-Parameters: trivial testlist=sanity-quota
Test-Parameters: testlist=sanity-quota serverversion=EXA5
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I80fbd42dea865dff1d106724dbf69946d23ebbe5
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/46332
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months ago EX-9387 lipe: Update tests in lipe_scan3
Vitaliy Kuznetsov [Sun, 17 Mar 2024 22:35:12 +0000 (23:35 +0100)]
EX-9387 lipe: Update tests in lipe_scan3

This short patch fixes the lipe_scan3 and lipe_find3
automated tests. It also adds new attributes to the tests
that run lipe_scan3.

Fixes: 9093217a72 ("EX-8130 lipe: Link a new statistics module")
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ibe00b2eb75da00ce4a8dbfd7a4f4d74b365104de
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago EX-9125 tests: get stats_compr param in sanity-compr/1007
Jian Yu [Mon, 5 Feb 2024 21:02:18 +0000 (13:02 -0800)]
EX-9125 tests: get stats_compr param in sanity-compr/1007

This patch reads the stats_compr parameter in sanity-compr
tests 1007 and 1008 to track data compression statistics.

Test-Parameters: trivial testlist=sanity-compr env=ONLY="1007 1008",ONLY_REPEAT=3

Change-Id: I7df25d36c8c89ec5a568bf1e8d5694a97dd18ecc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53927
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
14 months ago LU-17434 lmv: add exclude list for remote dir
Lai Siyao [Tue, 16 Jan 2024 19:18:30 +0000 (14:18 -0500)]
LU-17434 lmv: add exclude list for remote dir

Apache Spark creates a _temporary subdirectory for staging files, and
it should be created on the same MDT as its parent directory. Add a
tunable lmv.*.qos_exclude_prefixes: if a directory prefix is in this
list, lmv_create() should put the directory on its parent MDT.

This prefix list follows the same rule as the shell PATH environment
variable: ':' is used as the separator between prefixes. For
convenience, '+/-' can be used to add/remove prefixes.
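
The list syntax described above can be sketched as follows. This is an illustration in Python, not the lmv code: update_prefixes() is a hypothetical helper, and the assumption that a value without a leading '+' or '-' replaces the whole list is mine, not stated in this commit.

```python
def update_prefixes(current, value):
    """Apply a PATH-style update string to a list of prefixes.

    '+a:b' adds prefixes and '-a:b' removes them. A bare 'a:b' is
    assumed here to replace the whole list.
    """
    if value.startswith("+"):
        op, body = "add", value[1:]
    elif value.startswith("-"):
        op, body = "remove", value[1:]
    else:
        op, body = "set", value

    # ':' separates prefixes; empty fields are ignored.
    names = [p for p in body.split(":") if p]

    if op == "set":
        return list(dict.fromkeys(names))
    if op == "add":
        # dict.fromkeys() drops duplicates while keeping order.
        return list(dict.fromkeys(list(current) + names))
    return [p for p in current if p not in names]
```

For example, update_prefixes(["_temporary"], "+.staging:tmp") yields a three-entry list, and applying "-tmp" to that result removes the last entry again. The ".staging" and "tmp" names are made up for the example.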

Add sanity 413k.

Lustre-change: https://review.whamcloud.com/53780
Lustre-commit: 5b07dce19b1830769d7a1f7bba8b559d3ead9dfb

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I4c8a118f0630c19054934a87bee3599bdb1fe7bb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54462
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months ago LU-17632 o2iblnd: graceful handling of CM_EVENT_CONNECT_ERROR
Serguei Smirnov [Mon, 11 Mar 2024 17:59:29 +0000 (10:59 -0700)]
LU-17632 o2iblnd: graceful handling of CM_EVENT_CONNECT_ERROR

There were examples in the field with RoCE setups which demonstrate
that RDMA_CM_EVENT_CONNECT_ERROR may be received when the conn state
is neither IBLND_CONN_ACTIVE_CONNECT nor IBLND_CONN_PASSIVE_WAIT.
Handle this more gracefully: report the event as unexpected
and allow the flow to continue.

Lustre-change: https://review.whamcloud.com/54353
Lustre-commit: 7f27a2fceef9a03d3ada74e258e774c8f5d420f0

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I58b2482207cfd821f6eac142bdefc8f5bc50f8b4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago RM-620 build: New tag 2.14.0-ddn139
Andreas Dilger [Sat, 16 Mar 2024 08:21:46 +0000 (02:21 -0600)]
RM-620 build: New tag 2.14.0-ddn139

New tag 2.14.0-ddn139

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I464d0618296d59093c0f85075daf3cdb653665e8

15 months ago RM-620 build: New tag lipe-2.44
Andreas Dilger [Sat, 16 Mar 2024 08:21:38 +0000 (02:21 -0600)]
RM-620 build: New tag lipe-2.44

New tag lipe-2.44

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ife8844a38d83a63175c5cf0182e1d23405143459

15 months ago EX-6247 lfsck: support compression in filter_fid
Hongchao Zhang [Fri, 8 Mar 2024 02:45:02 +0000 (10:45 +0800)]
EX-6247 lfsck: support compression in filter_fid

The OST object compression info has been stored in filter_fid
since EX-8038, so LFSCK also needs to handle the compression info
when processing the filter_fid.

Fixes: bf4f5295e594 ("EX-8038 csdc: expand filter_fid")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Id050214ba0057776a05c194dc9222117c5d7fd8b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-14869 test: improve sanity-flr/200a
Bobi Jam [Fri, 8 Mar 2024 07:07:59 +0000 (15:07 +0800)]
LU-14869 test: improve sanity-flr/200a

Make sure "flock -x" has returned successfully before running mirror
resync, so that the resync does not run its read while holding a
shared flock.

Lustre-change: https://review.whamcloud.com/54345
Lustre-commit: TBD (from 2ff02fec932ab6e00a4942517db76f48412db31a)

Test-Parameters: trivial testlist=sanity-flr env=ONLY=200a,ONLY_REPEAT=10
Test-Parameters: trivial testlist=sanity-flr env=ONLY=200a,ONLY_REPEAT=10
Test-Parameters: trivial testlist=sanity-flr env=ONLY=200a,ONLY_REPEAT=10
Test-Parameters: trivial testlist=sanity-flr env=ONLY=200a,ONLY_REPEAT=10
Test-Parameters: trivial testlist=sanity-flr env=ONLY=200a,ONLY_REPEAT=10
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I6383af5d5761980d24af19efd4a4ac899f369a7d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-14971 test: align mirror_io resync implementation
Bobi Jam [Tue, 12 Oct 2021 12:02:01 +0000 (20:02 +0800)]
LU-14971 test: align mirror_io resync implementation

Align the mirror_io resync implementation with
llapi_mirror_resync_many().

Lustre-change: https://review.whamcloud.com/45202
Lustre-commit: 9898ae5347616a097193f69811a74af3ddd88349

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Icf11c4c2302f36fc0f9682e0a310058081e1214f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54327
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago EX-9371 lipe: print FID after llapi_lease_set() error
Alexandre Ioffe [Sat, 9 Mar 2024 04:48:03 +0000 (20:48 -0800)]
EX-9371 lipe: print FID after llapi_lease_set() error

Investigate 'cannot get UNLOCK lease, ext 8: Invalid argument (22)'.
Add an error message and print the FID when llapi_lease_set() fails.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Id498f2017ccdf8d8896ce731cc65ca45ed692d62
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17566 mdt: remove duplicate call to mdt_init_ucred_reint()
Aurelien Degremont [Tue, 20 Feb 2024 11:46:03 +0000 (12:46 +0100)]
LU-17566 mdt: remove duplicate call to mdt_init_ucred_reint()

Remove duplicate call to mdt_init_ucred_reint() from
mdt_reint_setxattr().

mdt_init_ucred_reint() is called in mdt_reint_internal(), which
covers all actual reinters. However, when SETXATTR was converted to
the reinters framework in fd908da, this call was not removed.
So mdt_init_ucred_reint() is called first in mdt_reint_internal() and
then again in the specific mdt_reint_setxattr() handler, without
anything special being done on the ucred between them.

Also merge __mdt_init_ucred() and mdt_init_cred() which was
called only once, and with the same prototype.

Lustre-change: https://review.whamcloud.com/54111
Lustre-commit: 65e0802f2ada98f802d01b5672bb9349ad0dde8c

Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: I90fed1d2709edf7337a27dd9c3cb0f75f7625135
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54368
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17175 gss: start lsvcgssd from l_getauth
Sebastien Buisson [Wed, 15 Nov 2023 10:22:13 +0000 (11:22 +0100)]
LU-17175 gss: start lsvcgssd from l_getauth

If l_getauth detects it cannot connect to the socket supposed
to be opened by lsvcgssd, it tries to launch the daemon, with
predefined default values.

Lustre-change: https://review.whamcloud.com/53142
Lustre-commit: 414467762f8a034c72903bab8ebfce6e1feb8e79

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3961ce0f548fb6ea23458edcb01a03fb8b3a617f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago EX-9400 sec: refresh entry directly
Yang Sheng [Tue, 5 Mar 2024 03:11:05 +0000 (11:11 +0800)]
EX-9400 sec: refresh entry directly

An INTERNAL entry can be processed directly. This avoids
releasing and retaking the lock frequently.

Fixes: fb0082bba1 ("EX-4333 sec: support supplementary groups from client")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I2af9c3964978c842dac8f70ad814adb529dff39f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54273
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago EX-9280 lipe: lpurge: Add periodical stats on scanned files
Alexandre Ioffe [Sat, 9 Mar 2024 00:32:04 +0000 (16:32 -0800)]
EX-9280 lipe: lpurge: Add periodical stats on scanned files

- Report an INFO message on each lpurge period.
- Include in the message the number of purged files, the number of
  files for which mirror deletion failed, and the number of files
  which are not purged due to a stale component.
- Reset the periodic counter every period.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ib70604c990801b79b0a0356991d45e83ec62db6c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17578 lnet: fix &the_lnet.ln_mt_peerNIRecovq race
Bruno Faccini [Fri, 23 Feb 2024 12:16:36 +0000 (13:16 +0100)]
LU-17578 lnet: fix &the_lnet.ln_mt_peerNIRecovq race

To avoid a race, &the_lnet.ln_mt_peerNIRecovq must always be
accessed with lnet_net_lock(0) protection.

Lustre-change: https://review.whamcloud.com/54163
Lustre-commit: 0a0e881d8884a220c485c0384351da12dc8aed9f

Test-Parameters: trivial
Fixes: da23037 ("LU-16563 lnet: use discovered ni status to set initial health")
Change-Id: Ic5e0194020200afdecba4cbf5afed274b14da388
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54382
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17522 build: Distribute clang build infrastructure
Shaun Tancheff [Mon, 11 Mar 2024 17:42:00 +0000 (10:42 -0700)]
LU-17522 build: Distribute clang build infrastructure

The macro files:
    lustre-toolchain.m4 lustre-compiler-plugins.m4
and the directory:
    cc-plugins
should be included in the distributed files unconditionally.

Lustre-change: https://review.whamcloud.com/53991
Lustre-commit: 881cc6384a5e37f0e64b56c1b34563b87dcc210d

Test-Parameters: trivial
Fixes: d684885098 ("LU-16961 clang: plugins and build system integration")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6ddedd82c6180ffd1c4134fda6af6df6bd23dd34
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54351
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-16961 clang: plugins and build system integration
Timothy Day [Mon, 11 Mar 2024 17:38:04 +0000 (10:38 -0700)]
LU-16961 clang: plugins and build system integration

Clang has a plugin system. Compiler extensions can be created
by making a shared library and loading it via the "-fplugin"
option. This makes it simple to implement custom warnings
and static analyzers.

This patch adds a plugin to detect functions that should have
been made static. This plugin has been run over the majority
of the Lustre tree and patches have been submitted for all
warnings. The plugin did not return any false positives in
my testing.

It also adds the "--enable-compiler-plugins" configure option,
which automatically builds and sets up the in-tree C compiler
plugins. The option force-enables the plugin regardless of
which compiler is in use. This behavior could be changed if
there is ever a need to support GCC specific plugins.

Also, add the configure checks needed to support building C++
in the Lustre tree. Clang and GCC plugins (and the compilers
themselves) are written in C++.

The license for the plugin mirrors that of the LLVM project
itself. This leaves the door open for contributing this
plugin upstream in the future. This isn't being upstreamed
now because it lacks any significant user community. Hence,
the plugin does not appear to meet the requirements for
upstreaming based on https://clang.llvm.org/get_involved.html.

Lustre-change: https://review.whamcloud.com/51659
Lustre-commit: d684885098c40fee2951feb410bec739717ac9bc

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I747ed91b53e765cc58e91a3eb9ec6c12b9908a96
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54350
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17243 pcc: replace i_mtime with inode_get_mtime()
Jian Yu [Tue, 12 Mar 2024 06:42:56 +0000 (23:42 -0700)]
LU-17243 pcc: replace i_mtime with inode_get_mtime()

This patch replaces i_mtime with inode_get_mtime(), and
i_mtime.tv_sec with inode_get_mtime_sec(), in the PCC code.

Test-Parameters: trivial testlist=sanity-pcc

Fixes: 3c586ca ("LU-17243 build: compatibility updates for kernel 6.6")
Change-Id: I756451ac9d38c3b434bb511b33d5d891b2b914ae
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54357
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17627 build: fix new mofed version
Minh Diep [Wed, 6 Mar 2024 02:26:58 +0000 (18:26 -0800)]
LU-17627 build: fix new mofed version

Allow multi-digit MOFED version numbers.
Fix the compare_version function to return the correct result.
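
To illustrate why multi-digit components matter, here is a minimal sketch of a numeric version comparison. It is written in Python for illustration only and is not the build script's compare_version function; the return convention (-1, 0, 1) and the version strings in the usage note are assumptions chosen for the example.

```python
import re


def _parts(version):
    """Split a version string into its numeric components."""
    return [int(x) for x in re.findall(r"\d+", version)]


def compare_version(a, b):
    """Return -1, 0 or 1 if a is older than, equal to, or newer than b."""
    pa, pb = _parts(a), _parts(b)
    # Pad the shorter list with zeros so that "5.8" equals "5.8.0".
    width = max(len(pa), len(pb))
    pa += [0] * (width - len(pa))
    pb += [0] * (width - len(pb))
    # Lists compare component by component as integers, so 23 > 5
    # even though the string "23" sorts before "5".
    return (pa > pb) - (pa < pb)
```

A plain string comparison would rank "23.10-1.1.9.0" before "5.8-1.0.1.1" because '2' sorts before '5'. Comparing the components as integers avoids that.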

Lustre-change: https://review.whamcloud.com/54336
Lustre-commit: TBD (from ec967a35b2ac09e780772bdfd4365f6fc7308417)

Change-Id: I0f585cb355bb34270003ae1139688080c301186a
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54289
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Deiter
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17179 tests: check the system is clean
Sergey Cheremencev [Mon, 9 Oct 2023 02:45:16 +0000 (06:45 +0400)]
LU-17179 tests: check the system is clean

Most of the tests cannot work correctly if the system
is not clean, so check this at the beginning of sanity-quota.

Lustre-change: https://review.whamcloud.com/52630
Lustre-commit: 7e1fb1a296ec7ab21be7ec39e2b6a38fbca76b6c

Test-Parameters: trivial testlist=sanity-quota,sanity-quota,sanity-quota
Test-Parameters: testlist=sanity-quota,sanity-quota mdscount=2 mdtcount=4
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ibfbe4663dee8476486e96eb99ccbcea13216861b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54392
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17602 mdd: use correct fid in mdd_rename
Alex Zhuravlev [Mon, 4 Mar 2024 04:32:04 +0000 (07:32 +0300)]
LU-17602 mdd: use correct fid in mdd_rename

mdd_rename() can re-insert the target name as part of error
handling. Use the correct FID for that, not the target directory's
own FID.

Lustre-change: https://review.whamcloud.com/54260
Lustre-commit: TBD (from 61a389ea659ce62790219b0d6edf72730284b007)

Fixes: 1c03346731 ("LU-17016 mdd: no EXDEV for parent dir projid mismatch")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0662fa005459416b070157a2d049fcf5ed08ae91
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago RM-620 build: New tag 2.14.0-ddn138
Andreas Dilger [Sat, 9 Mar 2024 07:49:05 +0000 (00:49 -0700)]
RM-620 build: New tag 2.14.0-ddn138

New tag 2.14.0-ddn138

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8320f67d93cc082c82597d6591eedf54a147b6be

15 months ago RM-620 build: New tag lipe-2.43
Andreas Dilger [Sat, 9 Mar 2024 07:48:48 +0000 (00:48 -0700)]
RM-620 build: New tag lipe-2.43

New tag lipe-2.43

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I787e91686d0a10e34bbf855db7f02faa6a75e281

15 months ago LU-15274 llite: whole file read fixes
Patrick Farrell [Mon, 12 Feb 2024 16:56:00 +0000 (11:56 -0500)]
LU-15274 llite: whole file read fixes

There are two significant issues with whole file read.

1. Whole file read does not interact correctly with fast
reads - specifically, whole file read is not recognized by
the fast read code so files below the
"max_read_ahead_whole_mb" limit will not use fast reads.
This has a significant performance impact.

2. Whole file read does not start from the beginning of the
file, it starts from the current IO index.  This causes
issues with unusual IO patterns, and can also confuse
readahead more generally (I admit to not fully understanding
what happens here, but the change is reasonable regardless.)
This is particularly important for cases where the read
doesn't start at the beginning of the file but still reads
the whole file (eg, random or backwards reads).

Performance data:
max_read_ahead_whole_mb defaults to 64 MiB, so a 64 MiB
file is read with whole file read, and a 65 MiB file is not.

Without this fix:
rm -f file
truncate -s 64M file
dd if=file bs=4K of=/dev/null
67108864 bytes (67 MB, 64 MiB) copied, 7.40127 s, 9.1 MB/s

rm -f file
truncate -s 65M file
dd if=file bs=4K of=/dev/null
68157440 bytes (68 MB, 65 MiB) copied, 0.0932216 s, 630 MB/s

Whole file readahead: 9.1 MB/s
Non whole file readahead: 630 MB/s

With this fix (same test as above):
Whole file readahead: 994 MB/s
Non whole file readahead: 630 MB/s (unchanged)
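
The size check and the corrected start offset described in this commit can be sketched as follows. This is only an illustration in Python, not the llite code: the whole_file_read_window() name and its return convention are hypothetical, and only the 64 MiB default of max_read_ahead_whole_mb is taken from the text above.

```python
MIB = 1 << 20


def whole_file_read_window(file_size, io_offset, max_read_ahead_whole_mb=64):
    """Return the (start, end) byte range for whole file read.

    Returns None when the file is over the limit and whole file read
    does not apply.
    """
    if file_size > max_read_ahead_whole_mb * MIB:
        return None
    # Start from the beginning of the file rather than from io_offset,
    # so a read that begins mid-file still covers the whole file.
    return (0, file_size)
```

Under this sketch a 64 MiB file read from offset 4096 gets the window (0, 64 MiB), while a 65 MiB file falls back to regular readahead.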

Lustre-change: https://review.whamcloud.com/54011
Lustre-commit: f8b276734f2457880888ce2bfa2349d22062dd64

Fixes: 7864a68 ("LU-12043 llite,readahead: don't always use max RPC size")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I72f0b58e289e83a2f2a3868ef0d433a50889d4c0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54276
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-16307 tests: fix sanity-sec test_31 better
Sebastien Buisson [Tue, 5 Mar 2024 09:12:16 +0000 (10:12 +0100)]
LU-16307 tests: fix sanity-sec test_31 better

Patch "LU-16307 tests: fix sanity-sec test_31" was landed to this
branch before the master version, which got further improvements.
In particular, the test has been updated to handle IPv6 and numeric
NIDs, and it has been tweaked to run out of tree.

Lustre-change: https://review.whamcloud.com/53818
Lustre-commit: 28b4d02161c38e624efb10d4815856cb9df3dc07

Fixes: 331a0e20dd ("LU-16307 tests: fix sanity-sec test_31")
Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd657c7555e598d0ebc08387eac537b1c73e35be
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54279
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17476 lnet: use bits only to match ME in all cases
Serguei Smirnov [Fri, 16 Feb 2024 19:01:21 +0000 (11:01 -0800)]
LU-17476 lnet: use bits only to match ME in all cases

If NIDs belong to the same peer and matchbits are matching,
declare a match even if matchbits are matched as not available
or ignored

Lustre-change: https://review.whamcloud.com/54082
Lustre-commit: a7ae2e5515879dc31e87106314d35dc439a2c50d

Test-Parameters: testlist=sanity env=ONLY=350,ONLY_REPEAT=10
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I394c492381a2d069b34516c473220192df05fbd2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54277
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17175 gss: background speedtest for Miller-Rabin rounds
Sebastien Buisson [Sat, 17 Feb 2024 12:27:15 +0000 (13:27 +0100)]
LU-17175 gss: background speedtest for Miller-Rabin rounds

The number of rounds used for Miller-Rabin testing of the prime
provided as input parameter to DH_check() is evaluated when the
lsvcgssd daemon starts. This speed test takes between 5 and 10 seconds
so it makes sense to run it in the background.
Any prime tested before the right number of rounds has been determined
would use the default from OpenSSL. This can lead to longer request
processing time, but this is only for a temporary and short period of
time.
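
The "use a default until the background measurement finishes" pattern can be sketched as follows. This is a Python illustration, not the lsvcgssd code: the RoundsEstimator class, the speedtest callable and the numeric values in the usage note are hypothetical.

```python
import threading


class RoundsEstimator:
    """Run a slow speed test in the background, serving a default meanwhile."""

    def __init__(self, speedtest, default_rounds):
        self._rounds = default_rounds
        self._lock = threading.Lock()
        self._thread = threading.Thread(
            target=self._run, args=(speedtest,), daemon=True)

    def start(self):
        # Returns immediately; the speed test runs in its own thread.
        self._thread.start()

    def _run(self, speedtest):
        measured = speedtest()
        with self._lock:
            self._rounds = measured

    def rounds(self):
        """Number of rounds to use right now (the default until measured)."""
        with self._lock:
            return self._rounds

    def wait(self, timeout=None):
        """Block until the speed test has finished (mainly for tests)."""
        self._thread.join(timeout)
```

A caller would create the estimator at startup, for example RoundsEstimator(run_speedtest, default_rounds=64) with a hypothetical run_speedtest function, call start(), and then call rounds() whenever a prime has to be checked. Requests that arrive early simply see the default value.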

Lustre-change: https://review.whamcloud.com/54088
Lustre-commit: d753dc75ad2a919e5fff3bc51c20b4569cd86a86

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If77f4374c5af463fdadd15979a594af1786af1df
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54278
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago EX-9007 lipe: Fix getting client mount path
Vitaliy Kuznetsov [Tue, 5 Mar 2024 16:51:40 +0000 (17:51 +0100)]
EX-9007 lipe: Fix getting client mount path

This patch fixes a "Segmentation fault" issue related
to the generation of size statistics report in '*.json'
format when the client is unmounted

Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ibae2fa2428ae7e202b08acae9354f303bbdbd739
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54284
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-13805 revert "osd: Implement unaligned DIO connect flag"
Andreas Dilger [Wed, 6 Mar 2024 00:28:24 +0000 (00:28 +0000)]
LU-13805 revert "osd: Implement unaligned DIO connect flag"

This reverts commit 88d324be08a44364aea9ff73c362a5e4ed4aaf6e.

There are further compatibility issues with UDIO on master and
this flag may be used for more than just ZFS compatibility, so
should not be advertised by servers for compatibility just yet.

Change-Id: Id357fa1a735ef4b8d6d90218250888f4ee04e5af
Test-Parameters: trivial
Fixes: 88d324be08 ("LU-13805: osd: Implement unaligned DIO connect flag")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54287
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
15 months ago LU-17317 sec: fix sanity-sec test_28
Sebastien Buisson [Mon, 4 Mar 2024 09:41:08 +0000 (10:41 +0100)]
LU-17317 sec: fix sanity-sec test_28

Improve sanity-sec test_28 to verify that srpc_contexts is valid
YAML output.
Also remove the ctx information from the output, as printing out a
kernel pointer is not ideal.

Lustre-change: https://review.whamcloud.com/54280
Lustre-commit: TBD (from 7ff01038c81a813cd122b28d21964316e7afdf28)

Fixes: 45f2b1ca36 ("LU-17317 sec: add srpc_serverctx proc file")
Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie48dc61adfd5017a2313981f27407c9d3b69dd71
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54281
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17317 sec: add srpc_serverctx proc file
Sebastien Buisson [Tue, 5 Dec 2023 13:14:58 +0000 (14:14 +0100)]
LU-17317 sec: add srpc_serverctx proc file

GSS srpc contexts for client connections can already be dumped via
proc file <mdc,osc>.*.srpc_contexts.
This patch adds a new proc file to dump server side GSS srpc contexts,
e.g.:
mgs.MGS.gss.srpc_serverctx
mdt.testfs-MDT0000.gss.srpc_serverctx
obdfilter.testfs-OST0000.gss.srpc_serverctx

The GSS context information is dumped as YAML, with one line per
context, like this:
0000000013221bdf: { peer_nid: 192.168.56.206@tcp, uid: 0, ctxref: 1,
expire: 1707934985, delta: 3401, flags: [uptodate, cached], seq: 0,
win: 2048, key: 00000000, keyref: 0,
hdl: "0x5ae1a771fd57043:0x65a64972fda4e200",
mech: "krb5 (aes256-cts-hmac-sha1-96)" }

Because of this new syntax, sanity-sec test_28 needs to be fixed.

Lustre-change: https://review.whamcloud.com/53376
Lustre-commit: f6687bafcb296aa7c152774de65bc865c774c464

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I37da9ffe6dd5884006b36271185a4d7155ead65b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54161
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17545 lnet: use unsafe_memcpy() when flexible array
Bruno Faccini [Thu, 15 Feb 2024 18:07:00 +0000 (19:07 +0100)]
LU-17545 lnet: use unsafe_memcpy() when flexible array

To avoid <memcpy: detected field-spanning write (size 64)
of single field "&lp->lp_data->pb_info" at
.../lnet/lnet/peer.c:2456 (size 16)> false positive
msgs/error.

Lustre-change: https://review.whamcloud.com/54069
Lustre-commit: 1b936f5b545f804f3ed21a18f90ceafd705291b2

Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I4e2fc58e31f60b434a9050393cd65b89c54f0798
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54290
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-7729 osc: Add counters for compressed data
Vitaliy Kuznetsov [Tue, 5 Mar 2024 09:19:25 +0000 (10:19 +0100)]
EX-7729 osc: Add counters for compressed data

This patch is the first of two patches that add counters
to track client/server-side data compression statistics.
This patch adds a new compr_stats file, osc.*.compr_stats.

The added counters are:
1. Size of compressed/uncompressed chunks written/read by
   client to compressed files, in chunks/bytes;
2. Compressed page counter.

Test-Parameters: trivial testlist=sanity-compr
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I091153480e53309c641d39f271bef536296dc09e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53737
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-7729 osc: Do not iterate over chunk pages
Artem Blagodarenko [Mon, 4 Mar 2024 13:01:09 +0000 (13:01 +0000)]
EX-7729 osc: Do not iterate over chunk pages

If osc_decompress() knows the chunk size, there is no need to
iterate over all pages in the chunk. They can be skipped.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Ib84b060075c55c97eba9f74ef017c0a956e85b12
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54270
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17392 build: compatibility updates for kernel 6.7
Shaun Tancheff [Fri, 8 Mar 2024 01:18:58 +0000 (17:18 -0800)]
LU-17392 build: compatibility updates for kernel 6.7

Linux commit v6.6-rc4-53-gc42d50aefd17
  mm: shrinker: add infrastructure for dynamically allocating
      shrinker

Users of struct shrinker must dynamically allocate shrinker objects
to avoid run-time warnings.

Provide a wrapper for older kernels to alloc+register shrinkers
and unregister+free.

Use get_group_info() and put_group_info() wrappers instead of
open coding the reference counting on group_info.usage

Lustre-change: https://review.whamcloud.com/53621
Lustre-commit: TBD (from 7e8165620e1943189d63ce4770c7722fa309a58e)

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie07bdb7fe3eb6060bd84f95f860f1b53d120a605
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54323
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-6142 osc: tidy up osc_init()
Mr. NeilBrown [Fri, 8 Mar 2024 00:57:03 +0000 (16:57 -0800)]
LU-6142 osc: tidy up osc_init()

A module_init() function that registers the services
of the module should do that last, after all other
initialization has succeeded.
This patch moves the class_register_type() call to the
end and ensures everything else that might have been
set up, is cleaned up on error.
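
The ordering described above (set everything up first, register last, and undo completed steps on failure) can be sketched as follows. This is an illustrative Python sketch only, not the osc_init() code: module_init(), the step names, and failing_register() are hypothetical, and in C the same unwinding is usually written with goto labels.

```python
def module_init(setup_steps, register):
    """Run setup steps in order, then register; unwind on any failure.

    setup_steps is a list of (init, cleanup) callables (hypothetical).
    """
    done = []
    try:
        for init, cleanup in setup_steps:
            init()
            done.append(cleanup)
        # Registration comes last, so nothing is exposed before setup succeeds.
        register()
    except Exception:
        # Undo only the steps that completed, in reverse order.
        for cleanup in reversed(done):
            cleanup()
        raise


log = []
steps = [
    (lambda: log.append("init cache"), lambda: log.append("free cache")),
    (lambda: log.append("init procfs"), lambda: log.append("remove procfs")),
]


def failing_register():
    raise RuntimeError("registration failed")


try:
    module_init(steps, failing_register)
except RuntimeError:
    pass
```

Because registration is the final step, a failure anywhere leaves no registered type behind and every earlier allocation is released.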

Linux-commit: e67f133d02e ("staging: lustre: osc: tidy up osc_init()")

Lustre-change: https://review.whamcloud.com/49458
Lustre-commit: f66b0c3b22bfcf0d7ac9383df5d87317f831a03d

Change-Id: I2a5ffb116c6d7c33a4530bab6e89a5ffe6117cea
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54322
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-6142 lustre: remove module_vars arg to class_register_type()
Mr NeilBrown [Fri, 8 Mar 2024 00:51:43 +0000 (16:51 -0800)]
LU-6142 lustre: remove module_vars arg to class_register_type()

The module_vars arg to class_register_type() is always NULL.  So it
can be removed.

Lustre-change: https://review.whamcloud.com/39383
Lustre-commit: 8d8e87a5ac7e9d072383019228270eb4681a597e

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie8d7e79075ba068dec606cc9dfcc38a90e371e5b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17243 build: compatibility updates for kernel 6.6
Shaun Tancheff [Wed, 6 Mar 2024 21:18:54 +0000 (13:18 -0800)]
LU-17243 build: compatibility updates for kernel 6.6

linux kernel v5.19-rc1-4-gc4f135d64382
  workqueue: Wrap flush_workqueue() using a macro
linux kernel v6.5-rc1-7-g20bdedafd2f6
  workqueue: Warn attempt to flush system-wide workqueues.
If __flush_workqueue(system_wq) is not available fall back to
flush_scheduled_work()

linux kernel v6.5-rc1-92-g13bc24457850
  fs: rename i_ctime field to __i_ctime
Use accessors for ctime. Provide replacements for older
kernels.

linux kernel v6.5-rc1-95-g0d72b92883c6
  fs: pass the request_mask to generic_fillattr
Provide request_mask argument where needed.

Linux commit v6.5-rc2-20-g2ddd3cac1fa9
  nsproxy: Convert nsproxy.count to refcount_t
Provide a wrapper for inc/dec of nsproxy.count

linux kernel v6.5-rc4-110-gcf95e337cb63
  mm: delete mmap_write_trylock() and vma_try_start_write()
Use down_write_trylock() directly instead of mmap_write_trylock()

In preparation for kernel 6.7 the remaining inode time
accessors will be preferred:

linux kernel v6.6-rc5-86-g12cd44023651
  fs: rename inode i_atime and i_mtime fields
Use accessors for atime and mtime. Provide replacements for
older kernels.

Lustre-change: https://review.whamcloud.com/52908
Lustre-commit: TBD (from 223377dea7029118dd9c0deb1958bf7222117009)

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ide6c2e3e8db532449850b145c2d61b972d21f649
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17081 build: Prefer folio_batch to pagevec
Shaun Tancheff [Wed, 6 Mar 2024 19:33:09 +0000 (11:33 -0800)]
LU-17081 build: Prefer folio_batch to pagevec

Linux commit v5.16-rc4-36-g10331795fb79
  pagevec: Add folio_batch

Linux commit v6.2-rc4-254-g811561288397
  mm: pagevec: add folio_batch_reinit()

Linux commit v6.4-rc4-438-g1e0877d58b1e
  mm: remove struct pagevec

Use folio_batch and provide wrappers for older kernels to use
pagevec handling, conditionally provide a folio_batch_reinit

Add macros to ease adding pages to folio_batch(es) as well
as unwinding batches of struct folio where struct page is
needed.

Lustre-change: https://review.whamcloud.com/52259
Lustre-commit: TBD (from 81c567481b7be1d9d4655a47027918f7a8d16ff8)

HPE-bug-id: LUS-11811
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ie70e4851df00a73f194aaa6631678b54b5d128a1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54074
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Link a new statistics module
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:56:23 +0000 (15:56 +0100)]
EX-8130 lipe: Link a new statistics module

This patch links the collection of statistics for
directories with the main collection of statistics
about file sizes. This way we collect all the statistics
at one time.

This patch also adds 2 new options for collecting statistics
via lipe_find3.

The -depth option serves as a limiter on the collection and
output of directory statistics to a file.
The -top-rating option allows adjustment of the size of the
table ranking the largest directories by size.

A mechanism has also been added that allows a copy of the
path from the filter for correct data processing.

Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ia61f57536575fd17e40b6e34a0c4b9b5db9111c5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add output for dir sizes stats
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:49:39 +0000 (15:49 +0100)]
EX-8130 lipe: Add output for dir sizes stats

This patch adds functions for displaying size statistics
for directories in the general report.
This patch adds support for *.out format only.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Iaf70aa4d84295f1a1a297b00fa45f12fb98c7625
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53983
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add entry point for dirs stats
Vitaliy Kuznetsov [Mon, 19 Feb 2024 15:44:35 +0000 (16:44 +0100)]
EX-8130 lipe: Add entry point for dirs stats

This patch adds a function that is an entry point for
collecting statistics about directory sizes.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ide0c6006e287f69a1de99a5578ceab0070ea383e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add key func for work with tree
Vitaliy Kuznetsov [Mon, 19 Feb 2024 15:37:42 +0000 (16:37 +0100)]
EX-8130 lipe: Add key func for work with tree

This patch adds two key functions to collect directory size
statistics, which contain the basic logic for adding
directories to memory and incrementing size counters.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I06e03a6be1052b7178274835169cc41d044ca1ab
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add helper functions for stats
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:32:25 +0000 (15:32 +0100)]
EX-8130 lipe: Add helper functions for stats

This patch adds several helper functions for working with
directory size statistics. Also adds ls3_stats_rm_first_dir(),
which removes the directory from the list to increase counters.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I04807846b49d6fb0e476b8bf146ba337f80e3d5e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add func for working with paths
Vitaliy Kuznetsov [Mon, 19 Feb 2024 19:02:44 +0000 (20:02 +0100)]
EX-8130 lipe: Add func for working with paths

This patch adds directory path processing helper functions
that will be used later to collect directory size statistics.

These functions set the stage for working on increasing the
size counters for each directory, along the entire chain in
the file or directory path.
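
As a rough illustration of the idea described above, charging a file's size to every directory along its path might look like the sketch below. It is not the lipe_scan3 implementation: the function name charge_path() and the dict-based counters are assumptions made for this example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def charge_path(dir_sizes, file_path, size):
    """Add size to the counter of every ancestor directory of file_path."""
    # PurePosixPath.parents yields the parent, grandparent, ... up to "/".
    for parent in PurePosixPath(file_path).parents:
        dir_sizes[str(parent)] += size


dir_sizes = defaultdict(int)  # hypothetical in-memory counters
charge_path(dir_sizes, "/proj/a/data.bin", 100)
charge_path(dir_sizes, "/proj/b/log.txt", 20)
```

Here "/proj" ends up holding the sum of both files, while "/proj/a" and "/proj/b" each hold only their own file.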

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I6e7302e9771dce2933c6730a1117fec3bc2b0fda
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Directory scan size stats
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:52:10 +0000 (15:52 +0100)]
EX-8130 lipe: Directory scan size stats

This patch adds functionality for creating new
directories in memory and expanding the allocation
for new child directories. It also adds a function to
initialize the starting directory.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I3ff6a62ffd9d6535ed4434f517d1c93d6ae01b34
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add functions to create a rating
Vitaliy Kuznetsov [Mon, 19 Feb 2024 20:25:29 +0000 (21:25 +0100)]
EX-8130 lipe: Add functions to create a rating

This patch adds new functionality to directory statistics
for working with a ranking table for the largest
directories (like TOP 100).

The creation of a structure for storing the rating occurs
after the lipe_scan3 scan is completed. The number of
objects in the structure is determined before lipe_scan3
is launched by the default value or by the user in lipe_find3
via the -top-rating option and is not expanded while lipe_scan3
is running. Adding new objects to the heap works by the logic
of replacing the object with the smallest size in the heap with
a new object if its size is larger. Adding objects to the heap
occurs when printing the results about the directory sizes,
since only in this case do we know the final sizes of the directories.
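
The replacement rule described above corresponds to a fixed-size min-heap. The following is a minimal Python sketch under that assumption; the class name TopDirs and its methods are hypothetical and do not come from the lipe sources.

```python
import heapq


class TopDirs:
    """Keep the `limit` largest (size, path) pairs offered so far."""

    def __init__(self, limit):
        self.limit = limit  # fixed up front, never expanded
        self.heap = []      # min-heap: heap[0] is the smallest retained entry

    def offer(self, path, size):
        if len(self.heap) < self.limit:
            heapq.heappush(self.heap, (size, path))
        elif size > self.heap[0][0]:
            # Replace the smallest retained entry with the larger newcomer.
            heapq.heapreplace(self.heap, (size, path))

    def ranking(self):
        # Largest first, for printing.
        return sorted(self.heap, reverse=True)


top = TopDirs(limit=2)
for path, size in [("/a", 10), ("/b", 50), ("/c", 30), ("/d", 5)]:
    top.offer(path, size)
```

Keeping the smallest entry at the root makes each comparison O(1) and each replacement O(log n), so memory stays bounded by the configured table size.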

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ie4a449fe69022716232638e0f856a10850403831
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8130 lipe: Add directory scan structs
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:21:21 +0000 (15:21 +0100)]
EX-8130 lipe: Add directory scan structs

This patch adds new structures to lipe_scan3 for collecting
and storing directory statistics, as well as initialization
and destroy functions. This patch is the first in a series
of patches that add functionality for collecting directory
statistics in lipe_find3.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ib14ce13677d93d1a53299501138e78c7b290793c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoRM-620 build: New tag 2.14.0-ddn137
Andreas Dilger [Sun, 3 Mar 2024 10:31:20 +0000 (03:31 -0700)]
RM-620 build: New tag 2.14.0-ddn137

New tag 2.14.0-ddn137

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icd060c758aa849abcefb7517f0d120a679a5a1b5

15 months agoLU-17500 qmt: avoid "enforced bit set, but neither"
Sergey Cheremencev [Fri, 2 Feb 2024 20:07:00 +0000 (23:07 +0300)]
LU-17500 qmt: avoid "enforced bit set, but neither"

Don't call qmt_revalidate_qunit in qmt_set_with_lqe,
as it is possible that the lqe_enforced bit is not cleared
when hard and soft limits are being set to 0.
There is no reason to recalculate qunit and edquot when we
set limits to 0. For the case when limits are changed,
qunit and edquot will be calculated below in the "dirtied"
branch, so there is no reason to do this twice.

The patch helps to avoid the following error:
LustreError: 21362:0:(qmt_entry.c:746:qmt_adjust_qunit())
  $$$ enforced bit set, but neither hard nor soft limit are set

Lustre-change: https://review.whamcloud.com/53893
Lustre-commit: 7498e7c38dffe23752b03bf168f3b5419855b10b

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8f5d9630f43b66ae7ea2be0bf2c735a02e1f6299
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54185
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17481 mdt: count all opens in mdt.*.md_stats
Yang Sheng [Thu, 1 Feb 2024 16:31:13 +0000 (00:31 +0800)]
LU-17481 mdt: count all opens in mdt.*.md_stats

Count all opens for the MDT. Also add a test case to
verify it.

Lustre-change: https://review.whamcloud.com/53880
Lustre-commit: 055f939979b20eb769803ecffd0caa53c440ad7d

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I2fa90cc2b4ce8d7d039736a5f40a70cbeb04bf8c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-8130 obd: remove limit on client mounts
Andreas Dilger [Mon, 26 Feb 2024 20:15:21 +0000 (13:15 -0700)]
LU-8130 obd: remove limit on client mounts

Using the in-kernel rhashtable instead of cfs_hash_table
for obd->obd_uuid_hash has the side effect of limiting the number
of elements in the hash table, and thereby limits the maximum
number of Lustre clients to 16384.

The patch raises the limit to 2^31 (rhashtable default).

Fixes: e40b008e88 ("LU-8130 obd: convert obd uuid hash to rhashtable")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I222a6d0d2789ea9d1bb3530b3619d08ec83ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
15 months agoLU-17454 nodemap: allow mapping for root
Sebastien Buisson [Wed, 31 Jan 2024 14:40:44 +0000 (15:40 +0100)]
LU-17454 nodemap: allow mapping for root

Allow an id mapping for root, to match what is implemented for regular
users, with the following behavior:
- if admin property is set, root remains root.
- if admin property is not set, the idmap for '0' is taken into
  account.
- if admin property is not set and there is no idmap for '0' and
  deny_unknown property is not set, root is squashed to the squash
  uid/gid.
- if admin property is not set and there is no idmap for '0' and
  deny_unknown property is set, root is blocked.

Note that map_mode remains ignored for root. Also, capabilities are
not dropped for root when mapped, just like it is done for regular
users. If admins want to drop root capabilities, root must be
squashed.
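
The four cases listed above can be read as a small decision function. The sketch below is illustrative only and is not the nodemap code: map_root() and its parameters are hypothetical names, and the squash id of 65534 is simply an assumed default for the example.

```python
def map_root(admin, idmap, deny_unknown, squash_id=65534):
    """Return the id that client root (0) maps to, or None if root is blocked.

    idmap is a dict of client id -> filesystem id (hypothetical shape).
    """
    if admin:
        return 0              # admin property set: root remains root
    if 0 in idmap:
        return idmap[0]       # an explicit idmap for '0' is honoured
    if deny_unknown:
        return None           # no idmap and deny_unknown set: blocked
    return squash_id          # otherwise squashed to the squash uid/gid
```

For example, map_root(False, {0: 1000}, False) yields 1000, while map_root(False, {}, True) yields None to signal that root is blocked.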

sanity-sec test_15 is updated to test root mapping.

Lustre-change: https://review.whamcloud.com/53870
Lustre-commit: b4a336d0ce91c05ae48544b3fd2e56f0bcb0a8cf

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id2e950b99e3b3ba27179408c647e1f7b7c49e32e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54159
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17465 nodemap: change squash default value to 65534
Sebastien Buisson [Tue, 23 Jan 2024 09:07:25 +0000 (10:07 +0100)]
LU-17465 nodemap: change squash default value to 65534

Initially, default values for nodemap.squash_uid/gid/projid were set
to 99, to match user 'nobody'. But on newer systems, nobody has
changed to 65534 and 99 no longer exists.
It is safe to use 65534 in all cases, as even on older systems it
exists and corresponds to 'nfsnobody'.

Lustre-change: https://review.whamcloud.com/53802
Lustre-commit: d4927da410525db5f0524d618da47a17fe9c7835

Test-Parameters: testlist=sanity env=ONLY=432 serverversion=EXA5
Test-Parameters: testlist=sanity env=ONLY=432 clientversion=EXA5
Test-Parameters: testlist=sanity-quota env=ONLY=75 serverversion=EXA5
Test-Parameters: testlist=sanity-quota env=ONLY=75 clientversion=EXA5
Test-Parameters: testlist=sanity-selinux env=ONLY=21 serverversion=EXA5
Test-Parameters: testlist=sanity-selinux env=ONLY=21 clientversion=EXA5
Test-Parameters: testlist=sanity-sec env=ONLY="7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 32 33 34 35 36 55 61 64" serverversion=EXA5
Test-Parameters: testlist=sanity-sec env=ONLY="7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 32 33 34 35 36 55 61 64" clientversion=EXA5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2e20fda0fdc0d5bfdf964a890bfbd0b54b943cf4
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17357 mgc: wait for sptlrpc config log
Sebastien Buisson [Tue, 12 Dec 2023 16:49:49 +0000 (17:49 +0100)]
LU-17357 mgc: wait for sptlrpc config log

The sptlrpc config log is mandatory to establish connections to
targets with proper security context. So wait for its retrieval.

Add sanity-sec test_68 to exercise this, and improve test_32
for mgssec.

Lustre-change: https://review.whamcloud.com/53423
Lustre-commit: 4a3e428361a03b4bc777eddd466ba1ff8b72b51e

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5352e926dc6a9a68db1224629c68a42b74bee8a4
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54160
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17563 kernel: update SLES15 SP5 [5.14.21-150500.55.49.1]
Jian Yu [Fri, 1 Mar 2024 23:52:23 +0000 (15:52 -0800)]
LU-17563 kernel: update SLES15 SP5 [5.14.21-150500.55.49.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.49.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/54240
Lustre-commit: TBD (from fb361e2001c2e7fd34faea82236d427861e16ade)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp5 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3

Change-Id: I23868ff25ae093a52f004e556789805a644832ac
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54244
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17593 kernel: update RHEL 8.9 [4.18.0-513.18.1.el8_9]
Jian Yu [Fri, 1 Mar 2024 23:47:13 +0000 (15:47 -0800)]
LU-17593 kernel: update RHEL 8.9 [4.18.0-513.18.1.el8_9]

Update RHEL 8.9 kernel to 4.18.0-513.18.1.el8_9 for Lustre client.

Lustre-change: https://review.whamcloud.com/54238
Lustre-commit: TBD (from eee579bfc2f1e1a8c02e76c3a82701920b0703ff)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=el8.9 testlist=sanity

Test-Parameters: optional clientdistro=el8.9 testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.9 testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.9 testgroup=full-part-3

Change-Id: I2c928e4c08af278dacce1d1dc7a14fa77ffffa33
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54243
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17561 kernel: update RHEL 9.3 [5.14.0-362.18.1.el9_3]
Jian Yu [Fri, 1 Mar 2024 23:37:58 +0000 (15:37 -0800)]
LU-17561 kernel: update RHEL 9.3 [5.14.0-362.18.1.el9_3]

Update RHEL 9.3 kernel to 5.14.0-362.18.1.el9_3 for Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=el9.3 testlist=sanity

Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3

Lustre-change: https://review.whamcloud.com/54236
Lustre-commit: TBD (from 2bbdc9e49055de2eda43a6d4b745543f8e354740)

Change-Id: Iddfe57197d854e0be864c0ce64699f92fcc181d1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54242
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-13805 osd: Implement unaligned DIO connect flag
Andreas Dilger [Fri, 1 Mar 2024 22:18:39 +0000 (15:18 -0700)]
LU-13805 osd: Implement unaligned DIO connect flag

Unupgraded ZFS servers may crash if they receive unaligned
DIO, so we need a compat flag and a test to recognize those
servers.

This patch extracts server-side logic from two master patches
to improve interop testing, but does not implement client UDIO.

Lustre-change: https://review.whamcloud.com/51126
Lustre-commit: 0e6e60b1233b08952c338b2c4f121ef749a99f8b
Was-Change-Id: I5d6ee3fa5dca989c671417f35a981767ee55d6e2

Lustre-change: https://review.whamcloud.com/45616
Lustre-commit: 7194eb6431d2ef7245ef3b13394b60e220145187
Was-Change-Id: I7eeebf9a608f006c8095b95f0677adb99f19d640

Test-Parameters: trivial testlist=sanity env=ONLY=56 fstype=zfs
Test-Parameters: testlist=sanity env=ONLY=56 clientbuildno=4505 clientjob=lustre-master clientdistro=el8.8
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8b987c00f741a884ba28c18309cc2f90baf4809a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-13805 obd: Reserve unaligned DIO connect flag
Patrick Farrell [Wed, 9 Aug 2023 16:16:25 +0000 (12:16 -0400)]
LU-13805 obd: Reserve unaligned DIO connect flag

Unaligned DIO generally requires only client changes, but
an assert must be removed from ZFS servers for it to work
correctly.  This means we need a connect flag to recognize
whether or not a server running ZFS can safely use
unaligned DIO.

All OSTs will present this flag - to keep things simple -
but if the flag is not present, we'll still do unaligned
DIO to ldiskfs OSTs.

Actual implementation will be in another patch, this one
just creates the flag itself.

Lustre-change: https://review.whamcloud.com/51075
Lustre-commit: 4c96cbf89dba5e4bf8ddf98a18b72142c22a4289

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8b149cc54f4fb11e64182c65f2fbb01f8a3d3868
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53708
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-16518 ptlrpc: fix clang build errors
Timothy Day [Wed, 21 Feb 2024 21:14:17 +0000 (13:14 -0800)]
LU-16518 ptlrpc: fix clang build errors

Fixed bugs which cause errors on Clang.

The majority of changes involve adding
defines for the 'ptlrpc_nrs_ctl' enum.
This avoids having to explicitly cast
enums from one type to another.

An unused variable 'req' was removed from
'nrs_tbf_req_get'. A 'strlcpy' in
'sptlrpc_process_config' was copying the
wrong number of bytes. Another variable,
'rc' in 'sptlrpc_lproc_init', seemed to
be neglected unintentionally; this was also
fixed.

Lustre-change: https://review.whamcloud.com/49859
Lustre-commit: 50f28f81b5aa8f8ad1c8585bd7e262910f936e50

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If994c625199b392198f944f9cd21bbf2142bce69
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16962 build: parallel configure cleanup
Shaun Tancheff [Wed, 21 Feb 2024 20:24:53 +0000 (12:24 -0800)]
LU-16962 build: parallel configure cleanup

LC_REGISTER_SHRINKER_FORMAT_NAMED macro should use
  register_shrinker_format

Lustre-change: https://review.whamcloud.com/51670
Lustre-commit: 1e9d48625b9a99d651a2e96cf947b60723713304

LU-16962 build: parallel header checks

Add LB2_CHECK_LINUX_HEADER_SRC and LB2_CHECK_LINUX_HEADER_RESULT
macros to use for running header checks in parallel.

Migrate (most) header checks to parallel and run a subset
early as the results of those tests are required by other
configure tests.

Lustre-change: https://review.whamcloud.com/51673
Lustre-commit: 2e025641ef087f159ca000ff3c4acb3ce886b8a3

Test-Parameters: trivial
HPE-bug-id: LUS-11709
HPE-bug-id: LUS-11710
Fixes: 0006eb3644 ("LU-16328 llite: migrate_folio, vfs_setxattr")
Fixes: ca992899d5 ("LU-16351 llite: Linux 6.1 prandom, folios_contig, vma_iterator")
Fixes: 7fe7f4ca06 ("LU-16520 build: Move strscpy to libcfs common header")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0cb630d035a23edfa353040f4c0d25c46eb417d8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16957 build: Improve parallel --config-cache
Shaun Tancheff [Wed, 21 Feb 2024 20:22:16 +0000 (12:22 -0800)]
LU-16957 build: Improve parallel --config-cache

The parallel build should consider the configure cache before
adding tests to the parallel build pass.

Track the number of compile tests needed, skip the make when
no build tests are needed.

Also unify libcfs, core, and ldiskfs build passes to a single step.

Configure timings vs master

     master       master w/cache  |     patch         patch w/cache
 --------------   --------------- | ---------------  ----------------
 real  1m3.493s   real  0m34.024s | real  1m3.903s    real  0m8.404s
 user 1m34.587s   user  1m16.547s | user  1m37.191s   user  0m4.292s
 sys  0m35.119s   sys   0m22.687s | sys   0m35.297s   sys   0m5.514s

Lustre-change: https://review.whamcloud.com/51637
Lustre-commit: 0dfeed23d67fe5b3f283ec5b9671c94f0fe2303f

Test-Parameters: trivial
HPE-bug-id: LUS-11706
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6696b350e8315190a67c1463435b18a87d45813e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16793 build: Enable compile tests to require <module>.ko
Shaun Tancheff [Wed, 21 Feb 2024 20:18:06 +0000 (12:18 -0800)]
LU-16793 build: Enable compile tests to require <module>.ko

Currently the build tests only require that a kernel API test
produce an object file (.o).

Cases that have a missing symbol export, directly or
indirectly, will generate an object file and fail to
generate a kernel module (.ko).

Enable tests to select the stricter criteria.

Lustre-change: https://review.whamcloud.com/50849
Lustre-commit: 581db5e89e0d690961e49278a7b50ecce78e5a22

Test-Parameters: trivial
Fixes: cc5594df3e ("LU-16759 o2ib: MOFED 5.5+ ib_dma_virt_map_sg")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iae481f1287023ea6c2432d147c497fa0a55fd689
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17081 build: compatibility for 6.5 kernels
Shaun Tancheff [Fri, 16 Feb 2024 07:17:26 +0000 (23:17 -0800)]
LU-17081 build: compatibility for 6.5 kernels

Linux commit v6.4-rc2-29-gc6585011bc1d
  splice: Remove generic_file_splice_read()

Prefer filemap_splice_read and provide alternates for older kernels.

Linux commit v6.4-rc2-30-g3fc40265ae2b
  iov_iter: Kill ITER_PIPE

ITER_PIPE and iov_iter_is_pipe() are removed; provide a replacement
for iov_iter_is_pipe().

Linux commit v6.4-rc4-53-g54d020692b34
  mm/gup: remove unused vmas parameter from get_user_pages()

Use vma_lookup() to acquire the vma following get_user_pages()

Linux commit v6.4-rc7-1884-gdc97391e6610
  sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)

Use sendmsg when MSG_SPLICE_PAGES is defined. Provide a wrapper
using sendpage() for older kernels.

Lustre-change: https://review.whamcloud.com/52258
Lustre-commit: 2bb54b6383d57ac61092593b9e6d9c80801263f5

HPE-bug-id: LUS-11811
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I95a0954a602c8db08d30b38a50dcd50107c8f268
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54055
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17161 build: Avoid fortify_memset in OBD_FREE_PTR
Shaun Tancheff [Thu, 15 Feb 2024 01:26:00 +0000 (17:26 -0800)]
LU-17161 build: Avoid fortify_memset in OBD_FREE_PTR

OBD_FREE_PTR will optionally clear the about to be free()d
memory.

Unfortunately fortify_memset_chk() hits some false positives.

We can use __underlying_memset() if it is defined, to avoid
the fortify_memset_chk.

Lustre-change: https://review.whamcloud.com/52559
Lustre-commit: 58cc8cf98e37e9d8149d5f605a75d56f2cd4eb70

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iced53f22b97ed90e0970625c4fcbaa404054c54a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53956
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16518 build: llvm/clang support
Timothy Day [Thu, 15 Feb 2024 00:43:19 +0000 (16:43 -0800)]
LU-16518 build: llvm/clang support

Other projects, notably Linux, have build support for LLVM and
Clang via special environment variables. This is implemented
for Lustre, in the style of:

https://www.kernel.org/doc/html/latest/kbuild/llvm.html

Instances in which GCC is explicitly called are replaced by the
use of $CC. The proper environment variables are passed to make
invocations as needed.

All checks which influence global compiler and toolchain settings
are collected in 'config/lustre-toolchain.m4'.

A configure option is added to disable the strict error flags that
are passed to the C compiler by default. CFLAGS and EXTRA_CFLAGS
are made to work in the typical way. Having fine grained control
over compiler options makes experimenting with Clang smoother.

Some compile checks in 'lustre-core.m4' have been improved by using
unused variables and explicitly setting the compile flag to be used
during the test.

This also sets the execute bit on autogen.sh.

Tested with:
Linux (mainline) - 5.15.94
openZFS - 2.1.99
Lustre (latest master) - 2.15.55
CentOS - 8.5
Clang (default on CentOS) - 12.0.1

Lustre-change: https://review.whamcloud.com/50063
Lustre-commit: 7f1aa5b66b247f339a9e7c25415a9a5dd272763c

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia8654c22fa8fca7bfb96c545ac144a1d3737fa00
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54054
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-10994 clio: remove cpo_assume, cpo_unassume, cpo_fini
John L. Hammond [Wed, 14 Feb 2024 21:07:19 +0000 (13:07 -0800)]
LU-10994 clio: remove cpo_assume, cpo_unassume, cpo_fini

Remove the cl_page methods cpo_assume, cpo_unassume, and
cpo_fini. These methods were only implemented by the vvp layer and so
they can be easily inlined into cl_page_assume() and
cl_page_unassume().

Lustre-change: https://review.whamcloud.com/47373
Lustre-commit: 9045894fe0f5033334a39a35a6332dab4498e21e

LU-6142 clio: make cp_ref in cl_page a refcount_t

As this is used as a refcount, it should be declared
as one.

Lustre-change: https://review.whamcloud.com/49072
Lustre-commit: e19804a3b7e793a11b1c8b5e0db9f6315f243b8c

Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I260c5593983bac6742cf7577c26a4903e95ceb7c
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54037
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-10994 clio: remove cpo_own and cpo_disown
John L. Hammond [Wed, 14 Feb 2024 09:17:56 +0000 (01:17 -0800)]
LU-10994 clio: remove cpo_own and cpo_disown

Remove the cpo_own and cpo_disown methods from struct
cl_page_operations. These methods were only implemented by the vvp
layer so they can be inlined into cl_page_own0() and
cl_page_disown(). Move most of vvp_page_discard() and all of
vvp_transient_page_discard() into cl_page_discard().

Lustre-change: https://review.whamcloud.com/47372
Lustre-commit: 81c6dc423ce4c62a64d328e49697d26194177f9f

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I3f156d6ca3e4ea11c050b2addda38e84a84634b9
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-10994 clio: remove cl_page_export() and cl_page_is_vmlocked()
John L. Hammond [Wed, 14 Feb 2024 20:06:24 +0000 (12:06 -0800)]
LU-10994 clio: remove cl_page_export() and cl_page_is_vmlocked()

Remove cl_page_export() and cl_page_is_vmlocked(), replacing them with
direct calls to SetPageUptodate() and PageLocked().

Lustre-change: https://review.whamcloud.com/47241
Lustre-commit: 3d52a7c5753e80e78c3b6f6bb7a0b66b37f4849b

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I883d1664f4afc7a1d4006f9f4833db8125c0e8f5
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-10994 echo: remove client operations from echo objects
John L. Hammond [Wed, 14 Feb 2024 20:00:59 +0000 (12:00 -0800)]
LU-10994 echo: remove client operations from echo objects

Remove the client (io, page, lock) operations from echo_client
objects. This will facilitate the simplification of CLIO.

Lustre-change: https://review.whamcloud.com/47240
Lustre-commit: 6060ee55b194e37e87031c40e9d48f967eabe314

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: If9e55c7d54c171aa2e1bcf272641c2bd6be8ad48
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-10994 test: remove netdisk from obdfilter-survey
John L. Hammond [Wed, 14 Feb 2024 19:29:24 +0000 (11:29 -0800)]
LU-10994 test: remove netdisk from obdfilter-survey

Remove the netdisk case from obdfilter-survey. Remove subtests that
use echo_client over osc devices.

Lustre-change: https://review.whamcloud.com/47239
Lustre-commit: 51c491dac6aec99fc328732b4358e8d5732dc230

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I260001241cee3027f68e62077e5817221bd0c08b
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-10994 lov: remove lov_page
John L. Hammond [Wed, 14 Feb 2024 08:57:14 +0000 (00:57 -0800)]
LU-10994 lov: remove lov_page

Remove the lov page layer since it does nothing but costs 24 bytes per
page plus pointer chases.

Lustre-change: https://review.whamcloud.com/47221
Lustre-commit: 56f520b1a4c9ae64caa235e9ce7699e7fb627f0c

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Icd7b4b0041e0fe414a3a4143179f45845177960e
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-15477 osc: osc_extent_wait() deadlock
Andriy Skulysh [Wed, 14 Feb 2024 08:49:30 +0000 (00:49 -0800)]
LU-15477 osc: osc_extent_wait() deadlock

Thread 1:
vvp_io_write_commit
osc_io_commit_async
osc_page_cache_add
osc_extent_find
osc_extent_wait

Thread 2:
ptlrpcd_check
ptlrpc_check_set
brw_queue_work
osc_extent_make_ready
vvp_page_make_ready_start
__lock_page

We must not hold a page lock while we do osc_extent_find()

Lustre-change: https://review.whamcloud.com/46281
Lustre-commit: 821a8d7b481d34a54044dfe871e4532f0996de8a

Change-Id: Idf669bc8d9c943f28e3f5986826b9637d66ecfca
HPE-bug-id: LUS-10414
Fixes: a7299cb012 "LU-9920 vvp: dirty pages with pagevec"
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-11290 osc: Batch gang_lookup cbs
Patrick Farrell [Wed, 14 Feb 2024 08:44:47 +0000 (00:44 -0800)]
LU-11290 osc: Batch gang_lookup cbs

The osc_page_gang_lookup callbacks can be trivially
converted to operate in batches rather than one page at a
time.  This improves cancellation time for locks protecting
large numbers of pages by about 10% (after landing another
optimization, "LU-11290 ldlm: page discard speedup", it
shows 6% for canceling a lock on a 30GB cached file).

Truncate to zero time (with one lock protecting many pages)
was improved by about 5-10% as well.  Lock weighing
performance should be improved slightly as well, but is
tricky to benchmark.
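
The difference between per-page and batched callbacks can be
illustrated with a small Python model. The function names and the
batch size of 16 are assumptions made for the sketch; they are not
taken from the osc code.

```python
def lookup_per_page(pages, callback):
    """Old style: invoke the callback once for every page."""
    calls = 0
    for page in pages:
        callback([page])
        calls += 1
    return calls


def lookup_batched(pages, callback, batch_size=16):
    """Batched style: hand the callback a slice of pages at a time.

    batch_size is a hypothetical value chosen for illustration.
    """
    calls = 0
    for start in range(0, len(pages), batch_size):
        callback(pages[start:start + batch_size])
        calls += 1
    return calls
```

Both variants visit the same pages in the same order; the batched one
only reduces the number of callback invocations, which is where the
savings for locks covering many pages would come from.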

Lustre-change: https://review.whamcloud.com/33089
Lustre-commit: 0d6d0b7bc95a82dee02d35d0a8a41d24692cad45

HPE-bug-id: LUS-6432
Change-Id: Ib30594ae97182cbeb18051d6cee860c97ae7e119
Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54031
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-14047 lustre: change EWOULDBLOCK to EAGAIN
John L. Hammond [Wed, 14 Feb 2024 08:39:27 +0000 (00:39 -0800)]
LU-14047 lustre: change EWOULDBLOCK to EAGAIN

On Linux, EWOULDBLOCK has always been defined as an alias for
EAGAIN. In the interest of readability we should not use two names for
the same thing. So change the remaining uses of EWOULDBLOCK to EAGAIN
and add EWOULDBLOCK||EAGAIN to spelling.txt.
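
The aliasing can be observed from user space as well. The sketch
below uses Python's standard errno module; the would_block() helper is
hypothetical and only shows that a single name is enough once the two
constants are known to be equal.

```python
import errno

# Expected to be True on Linux, where the two names share one value.
# Other platforms may define them differently, so this is computed
# rather than assumed.
SAME_VALUE = errno.EWOULDBLOCK == errno.EAGAIN


def would_block(err):
    """Return True when err means "try again later".

    Only EAGAIN is compared, mirroring the single-name convention.
    """
    return err == errno.EAGAIN
```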

Lustre-change: https://review.whamcloud.com/40307
Lustre-commit: a7f48e6c15e28617793d89958c79e9ed8cb73e65

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib48b8a1e58bfa961d2a4ba411c038c476bfc300d
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54030
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoRM-620 build: New tag 2.14.0-ddn136
Andreas Dilger [Sat, 24 Feb 2024 03:53:00 +0000 (20:53 -0700)]
RM-620 build: New tag 2.14.0-ddn136

New tag 2.14.0-ddn136

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If89f3f4d33b83da86a63b998d141793db509c013

15 months agoEX-8669 llite: set STATX_ATTR_COMPRESSED flag in ll_iocontrol()
Jian Yu [Wed, 7 Feb 2024 09:08:04 +0000 (01:08 -0800)]
EX-8669 llite: set STATX_ATTR_COMPRESSED flag in ll_iocontrol()

This patch extracts the compression flag LUSTRE_COMPR_FL from
mbo_flags and sets the STATX_ATTR_COMPRESSED flag in ll_iocontrol()
to please lsattr and other e2fsprogs tools.

Test-Parameters: trivial testlist=sanity-compr

Change-Id: I14d2082a6719a1ca5708f7aef7a2fb0f085ca63c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53953
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-9107 ldiskfs: sync ext4-mballoc-dense with master
Alex Zhuravlev [Wed, 31 Jan 2024 20:05:49 +0000 (23:05 +0300)]
EX-9107 ldiskfs: sync ext4-mballoc-dense with master

extend ac_flags to fit new EXT4_MB_VERY_DENSE

Fixes: f36eda6a1e ("LU-10026 osd-ldiskfs: use preallocation for dense writes")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id024cbca902d56728133d7d3e69d56fc355c1bc1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53871
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17034 quota: lqeg_arr memmory corruption
Sergey Cheremencev [Fri, 25 Aug 2023 06:22:26 +0000 (10:22 +0400)]
LU-17034 quota: lqeg_arr memmory corruption

Fix memory corruption caused by accessing memory outside
the lqeg_arr array. It could happen when at least one OST
has an index larger than the total number of OSTs, for
example if the system has 4 OSTs with indexes 0001, 0002,
00c9, 00ca. This issue most often corrupted bucket_table
in obd_uuid_hash or obd_nid_hash, causing the rhashtable
code to crash. However, it could also be the cause of other
panics, depending on the type of the corrupted neighbouring
memory region.

This patch adds an lge_idx field to each lqe global entry
to store the index of the OST. It is needed to map the OST
index to the array index and avoid out-of-bounds array access.
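
As a rough model of the mapping, the Python sketch below stores the
OST index in each entry and searches for it instead of using the OST
index directly as an array subscript. Only the lge_idx name comes from
the description above; the helper functions and the dictionary layout
are hypothetical.

```python
def make_entries(ost_indexes):
    """Build one entry per OST, recording its OST index (cf. lge_idx)."""
    return [{"lge_idx": idx} for idx in ost_indexes]


def find_entry(entries, ost_idx):
    """Return the array slot holding ost_idx, or None if it is unknown.

    Searching by the stored index keeps every access inside the array,
    even when OST indexes are sparse.
    """
    for slot, entry in enumerate(entries):
        if entry["lge_idx"] == ost_idx:
            return slot
    return None
```

With the example indexes 0001, 0002, 00c9 and 00ca, the array has four
slots and OST 00c9 resolves to slot 2, whereas subscripting a
four-element array with 0xc9 would land far outside it.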

This patch also adds locking to protect lqe_glbl_data in
qmt_set_revoke and qmt_clear_lgeg_arr_nu. This was
forgotten in 50ff4d1da6.

This patch begins to store all connected MDTs in the quota
global pool. From this patch on, MDTs are therefore handled
the same way as OSTs stored in the global pool. It is the
first step towards introducing MDT pools.

Add conf-sanity_33c, which reproduces the memory corruption
described above when the fix is not applied.

Lustre-change: https://review.whamcloud.com/52094
Lustre-commit: 67f90e42889ff22d574e82cc647f6076e48c65a5

Fixes: 50ff4d1da6 ("LU-16772 quota: protect lqe_glbl_data in qmt_site_recalc_cb")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Id6e4bcde09d9f32726d69f711eedb82729a2266e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53810
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17034 revert: "quota: tmp fix against memory corruption"
Sergey Cheremencev [Thu, 18 Jan 2024 19:03:50 +0000 (22:03 +0300)]
LU-17034 revert: "quota: tmp fix against memory corruption"

This reverts commit fdcb1144c95908bbbd0216ec931ac5f222f484a7
as it was a temporary solution. It is replaced by
"LU-17034 quota: lqeg_arr memmory corruption".

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I6c057ff7e0f9c8789190c51c14fc370afe0c703c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17334 lmv: handle object created on newly added MDT
Lai Siyao [Thu, 7 Dec 2023 12:39:09 +0000 (07:39 -0500)]
LU-17334 lmv: handle object created on newly added MDT

When a new MDT is added to a filesystem without no_create, a new
object is created on the MDT relatively quickly after it is added to
the filesystem, in particular because the new MDT would be preferred
by QOS space balancing due to lots of free space. However, it might
take a few seconds for the addition of the new MDT to be propagated
across all of the clients, so there is a risk that one client creates
a directory on an MDT that another client is not yet aware of, which
returns an error to the application immediately.

This patch fixes the issue by adding lmv_tgt_retry(), which retries
using the MDT and waits for some number of seconds for the filesystem
layout to be updated if the MDT index of an existing file/directory is
not found.

Commands that depend on user input, like 'lfs mkdir -i' and 'lfs df',
and round-robin MDT allocation will continue to use lmv_tgt(), which
doesn't retry in case the user specifies a wrong MDT index; otherwise
it could hang the command for an extended period of time.
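
The contrast between the two lookups can be sketched in Python. Only
the split between a retrying and a non-retrying lookup comes from the
text above; the function names, the timeout and interval values, and
the refresh callback are assumptions for illustration.

```python
import time


def find_target(targets, index):
    """Non-retrying lookup: report a missing index at once."""
    return targets.get(index)


def find_target_retry(targets, index, refresh,
                      timeout=5.0, interval=0.5, sleep=time.sleep):
    """Retrying lookup: wait for the layout to include index.

    refresh(targets) is a hypothetical hook that picks up layout
    updates; timeout and interval are illustrative values in seconds.
    """
    waited = 0.0
    while True:
        tgt = targets.get(index)
        if tgt is not None or waited >= timeout:
            return tgt
        sleep(interval)
        waited += interval
        refresh(targets)
```

The non-retrying variant suits user-supplied indexes, where a wrong
value should fail promptly instead of blocking for the full timeout.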

Lustre-change: https://review.whamcloud.com/53363
Lustre-commit: 94a4663db95656ade6b6e695b849cd7763f0bd49

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idb0cf65e95f665628d6799298732b7a06cde4a86
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54018
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17469 llite: hold object reference in IO
Bobi Jam [Mon, 22 Jan 2024 12:14:56 +0000 (20:14 +0800)]
LU-17469 llite: hold object reference in IO

There could be a race between page write and inode free; hold
a cl_object reference during the IO to avoid accessing a freed object.

Lustre-change: https://review.whamcloud.com/53819
Lustre-commit: TBD (from a84242bc202e402664a5f5d7461b66c770896851)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic70cc27430e68265aba0662fc68e9bfe2f86cfe1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53760
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <paf0187@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoDDN-4630 sec: protect against concurrent mi_ginfo change
Sebastien Buisson [Thu, 22 Feb 2024 12:44:57 +0000 (13:44 +0100)]
DDN-4630 sec: protect against concurrent mi_ginfo change

With the INTERNAL upcall mechanism, we put in the upcall cache the
groups received from the client, by appending them to a list built
from previous requests.
An existing entry is never modified once it is marked as VALID; it is
replaced with a new one, with a larger groups list. However, the group
info associated with an entry can change when updated from NEW to
VALID. This means the number of groups can only grow from 0 (group
info not set) to the current number of collected groups.
In case of concurrent cache entry update, we need to check the group
info and start over adding the groups associated with the current
request.

Fixes: 4515e5365f ("LU-17015 build: rework upcall cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie7088bdbfcae396602b59e2ab07fbfbbb14d96af
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16297 ptlrpc: don't panic during reconnection
Alexander Boyko [Thu, 3 Nov 2022 11:23:20 +0000 (07:23 -0400)]
LU-16297 ptlrpc: don't panic during reconnection

ptlrpc_send_rpc() could race with ptlrpc_connect_import_locked()
in the middle of an assertion check, which leads to a spurious panic.
The assertion first checks

(AT_OFF || imp->imp_state != LUSTRE_IMP_FULL ||

then the reconnect changes the import state and flags,
and then the second part is evaluated:

(imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) ||
!(imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_AT)))

MSGHDR_AT_SUPPORT is disabled during client reconnection.
Taking a lock in this hot path is undesirable, so the fix changes
the assertion to a report.

Lustre-change: https://review.whamcloud.com/49029
Lustre-commit: df31c4c0b39b8845911344e6fadc008bcba40bb1

HPE-bug-id: LUS-10985
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ifc9e413c679c3e8a4c8f4f541251bebabae41c82
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-16281 clio: append to non-existent component
Vitaly Fertman [Tue, 5 Jul 2022 21:00:58 +0000 (00:00 +0300)]
LU-16281 clio: append to non-existent component

An append to a non-existent component should return an error, but it
currently fails with the BUG below because the return value @rc of
lov_io_layout_at() is not checked for < 0.

    stripe_width()) ASSERTION( index < lsm->lsm_entry_count ) failed:
    BUG: unable to handle kernel paging request at ffff99d3c2f74030
    Call Trace:
      lov_stripe_number+0x19/0x40 [lov]
      lov_page_init_composite+0x103/0x5f0 [lov]
      ? kmem_cache_alloc+0x12e/0x270
      cl_page_alloc+0x19f/0x660 [obdclass]
      cl_page_find+0x1a0/0x250 [obdclass]
      ll_write_begin+0x1f7/0xfb0 [lustre]

Lustre-change: https://review.whamcloud.com/48994
Lustre-commit: 8fdeca3b6faf22c72f6687aa23b86715d39ceeb1

HPE-bug-id: LUS-11075
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I4371f56cd9cdb3429d52a283831fb0a768e5c9c3
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54133
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
15 months agoLU-14441 mdc: check/grab import before access
Alex Zhuravlev [Mon, 13 Dec 2021 08:27:42 +0000 (11:27 +0300)]
LU-14441 mdc: check/grab import before access

Check and grab the import before access to ensure the import doesn't
disappear while being accessed via procfs.

Lustre-change: https://review.whamcloud.com/41681
Lustre-commit: b8416320b381ae8a6fdd058b0a09ea42ce56d573

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I005c96b349e55646996fd0d265ab4dd1e2b9a1fa
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54126
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17484 gss: reply error for SEC_CTX_INIT on wrong node
Sebastien Buisson [Thu, 8 Feb 2024 12:44:21 +0000 (13:44 +0100)]
LU-17484 gss: reply error for SEC_CTX_INIT on wrong node

When a server receives a SEC_CTX_INIT request for a target that is not
available (either stopping, or not set up yet, or moved to a failover
node), the request gets dropped. This makes the client-side RPC time
out, increasing the time it takes to establish a proper gss context
with the target, because it slows down the HA mechanism that tries
alternate failover NIDs.
Instead of dropping the request reply for SEC_CTX_INIT, the server
needs to send back a proper error reply. The client will then be able
to immediately try alternate failover NIDs, speeding mount/reconnect
process up, and avoiding potential eviction.

Lustre-change: https://review.whamcloud.com/53970
Lustre-commit: 3d635dd3f24421c181aca5673cd81ed8f3e2c622

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id2cefaa7d54729a63c7be13b65d7ace579bcaa78
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54157
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17528 gss: cleanup gss api usage
Sebastien Buisson [Thu, 15 Feb 2024 08:58:16 +0000 (09:58 +0100)]
LU-17528 gss: cleanup gss api usage

The lucid context support has been available from at least
krb5 1.7, and even RHEL7 ships with a more recent version.
So drop support for non-lucid api, and cleanup gss api usage.

Lustre-change: https://review.whamcloud.com/54063
Lustre-commit: 79a2d8645a28de77c7406ba56889d3a0749b851c

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I91fb706d2444c199156423b57a8c1ef24a0c3420
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17535 gss: fix lsvcgssd crash in krb lib
Bruno Faccini [Tue, 13 Feb 2024 11:14:40 +0000 (12:14 +0100)]
LU-17535 gss: fix lsvcgssd crash in krb lib

This patch fixes some logic around whether gss_delete_sec_context()
needs to be called, depending on the Kerberos implementation.

The address of snd->ctx, rather than its value, should be passed to
serialize_context_for_kernel()/serialize_krb5_ctx() so that
each implementation can reset it to GSS_C_NO_CONTEXT
if the context has been destroyed internally. Cases where it
has not been destroyed can now also be handled in handle_krb().

Lustre-change: https://review.whamcloud.com/54023
Lustre-commit: f2705c4ec5598ca244bbb08673a1cfefd7342812

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I752712168a2c0f0a5a7a496b851d4cddbb7e4236
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54155
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17226 build: create config option for l_getsepol
Gian-Carlo DeFazio [Thu, 16 Nov 2023 23:05:45 +0000 (15:05 -0800)]
LU-17226 build: create config option for l_getsepol

Add a configuration option for l_getsepol.
l_getsepol is built by default unless the --disable-l_getsepol
option is given to configure.
lustre.spec.in builds l_getsepol by default and has its
dependencies as build requirements.

The implicit configuration check for the dependency
openssl-devel is removed and replaced by a BuildRequires.

Lustre-change: https://review.whamcloud.com/52849
Lustre-commit: 2777adcabd1032ddb886f913fa04d82a292ab379

Test-Parameters: trivial
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: If71a2a4a524047edbd2b31e6fac7a42f36a030bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54162
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-9074 csdc: Provide finer grained enable_compression control
Artem Blagodarenko [Fri, 16 Feb 2024 16:50:08 +0000 (16:50 +0000)]
EX-9074 csdc: Provide finer grained enable_compression control

On all architectures other than aarch64 and ppc64le, enable_compression
is now enabled by default. The lfs warning message is gone.

To use CSDC on aarch64/ppc64le (at your own risk),
llite.*.enable_compression=1 should be set. The lfs
setstripe command still prints a warning message in this case.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Test-Parameters: trivial
Change-Id: Ic8edc5bbeb8f9a3cd34ad3fc4e8c78e59f4cc34f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53894
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Patrick Farrell <paf0187@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8993 ofd: do not write 'hole' pages on compression
Patrick Farrell [Tue, 13 Feb 2024 16:45:17 +0000 (11:45 -0500)]
EX-8993 ofd: do not write 'hole' pages on compression

When doing unaligned read-modify-write to a compressed file,
we must round the IO lnb used for write in order to read up
the compressed data for modification.

In some cases, this creates a situation where there are
pages in the write lnb which have no data in them.  It is
important not to write out these pages, because if we do,
this wastes space and can cause incorrect file size.

In most cases, the file size is covered by the client
sending the file size, but if the client does not compress
a particular write, it does not send the size and the server
does not use it.  We could resolve this by having the client
always send size info and have the server always use it, but
it's better to make server writes 'hole' aware, since this
improves space usage.  (And this will be required for the
server to do recompression on read-modify-write, otherwise
no space is gained.)
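
A minimal Python sketch of a 'hole' aware write follows. It assumes
each page in the write lnb can be modelled as a dictionary with a
"len" key giving the number of valid bytes; that representation and
the helper names are hypothetical, not the ofd data structures.

```python
def pages_with_data(lnb_pages):
    """Keep only pages that carry data; 'hole' pages are skipped."""
    return [page for page in lnb_pages if page["len"] > 0]


def bytes_written(lnb_pages):
    """Total bytes written once hole pages are excluded."""
    return sum(page["len"] for page in pages_with_data(lnb_pages))
```

Skipping the empty pages keeps both the space usage and the resulting
file size tied to the data that is actually present.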

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I66169e205fe4691ed03b2c9b3005ffc4ecd3213d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53595
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17258 socklnd: stop connecting on too many retries
Serguei Smirnov [Wed, 7 Feb 2024 18:48:08 +0000 (10:48 -0800)]
LU-17258 socklnd: stop connecting on too many retries

If a peer repeatedly rejects connection requests with EALREADY,
assume that it doesn't support as many connections as we're trying
to create. Stop connecting to the peer altogether and either
continue with the connections already created, if there's at least
one of each type, or fail.
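
The decision described above can be sketched roughly as follows.  This
is an assumption-laden illustration rather than the socklnd code: the
enum, the function name, the per-type connection counters and the
max_rejects limit are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical outcomes; these are not socklnd identifiers. */
enum sketch_action {
	SKETCH_KEEP_CONNECTING, /* retry limit not reached yet */
	SKETCH_USE_EXISTING,    /* stop connecting, use what we have */
	SKETCH_FAIL             /* stop connecting, some type is missing */
};

/* Decide what to do after the peer has rejected 'rejects' connection
 * requests with EALREADY.  conns_per_type[i] is the number of
 * connections of type i that are already established. */
static enum sketch_action sketch_on_ealready(unsigned int rejects,
					     unsigned int max_rejects,
					     const unsigned int *conns_per_type,
					     size_t ntypes)
{
	if (rejects < max_rejects)
		return SKETCH_KEEP_CONNECTING;

	/* Too many rejections: give up on creating more connections. */
	for (size_t i = 0; i < ntypes; i++) {
		if (conns_per_type[i] == 0)
			return SKETCH_FAIL;
	}
	return SKETCH_USE_EXISTING;
}
```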

This helps avoid the assertion:

"ASSERTION( (wanted & ((((1UL))) << (3))) != 0 ) failed"

Lustre-change: https://review.whamcloud.com/53955
Lustre-commit: 02caf7170762d97dac4f367651addc7d90b6eb32

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 5afe3b053 ("LU-17258 socklnd: ensure connection type established upon race")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I6072e91cc36544fc2f56c91cd78f6637cf82ecbc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17505 socklnd: return NETWORK_TIMEOUT to LNet on ETIMEOUT
Serguei Smirnov [Mon, 5 Feb 2024 23:27:15 +0000 (15:27 -0800)]
LU-17505 socklnd: return NETWORK_TIMEOUT to LNet on ETIMEOUT

Returning LNET_MSG_STATUS_LOCAL_TIMEOUT to LNet on ETIMEDOUT
causes LNet to only decrement the local NI health score,
while the issue may actually be with the remote NI.

Changing this to return LNET_MSG_STATUS_NETWORK_TIMEOUT
causes LNet to decrement both local NI and peer NI health.
If the local NI is healthy, it will recover its health score quickly,
but the affected peer NI's health stays lowered until the peer NI is
recovered.
This helps LNet select healthy NIs of the same peer in the meantime.
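
The difference between the two statuses can be sketched as below.  The
enum, the struct and the step value are hypothetical stand-ins chosen
for illustration; they are not the LNet health implementation.

```c
/* Hypothetical mirror of the two statuses discussed above. */
enum sketch_status {
	SKETCH_LOCAL_TIMEOUT,
	SKETCH_NETWORK_TIMEOUT
};

/* Hypothetical health scores for one local NI and one peer NI. */
struct sketch_health {
	int local_ni;
	int peer_ni;
};

/* Apply a timeout to the health scores: a local timeout lowers only
 * the local NI score, a network timeout lowers both scores. */
static void sketch_apply_timeout(struct sketch_health *h,
				 enum sketch_status status, int step)
{
	h->local_ni -= step;
	if (status == SKETCH_NETWORK_TIMEOUT)
		h->peer_ni -= step;
}
```

With the peer NI score lowered, a selection policy that prefers the
highest score would pick another NI of the same peer until the
affected one recovers.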

Lustre-change: https://review.whamcloud.com/53930
Lustre-commit: 099350d6e30218eb68d31cbfc7e9252a112e591f

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I916772477d1fd63571447262880a33830746f002
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53964
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>