Whamcloud - gitweb
fs/lustre-release.git
Amir Shehata [Sat, 16 Feb 2019 01:59:40 +0000 (17:59 -0800)]
LU-9121 lnet: select best peer and local net

Select the healthiest and highest priority peer and local net when
sending a message.

Lustre-change: https://review.whamcloud.com/34352
Lustre-commit: dff6587805ddad212ab48e5bedacbc7846542b7b

Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I42717e7fdc3226c6faa7c59c713f18422e27f2e5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Bobi Jam [Tue, 5 Sep 2023 06:54:44 +0000 (14:54 +0800)]
LU-17088 dom: don't create different size DOM component

Multiple DOM components are allowed in different mirrors, but they
must be of the same size; mirror extend should check this constraint.
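
The same-size constraint can be illustrated with a minimal Python sketch. The `Mirror` type and the `can_extend` helper are hypothetical names used only for illustration; they are not Lustre APIs, and the real check lives in the kernel's mirror-extend path.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mirror:
    # Size in bytes of the mirror's DOM component, or None if it has none.
    dom_size: Optional[int]


def can_extend(existing: List[Mirror], new: Mirror) -> bool:
    """Return True if adding `new` keeps all DOM components the same size."""
    if new.dom_size is None:
        return True  # no DOM component, so there is nothing to compare
    return all(m.dom_size in (None, new.dom_size) for m in existing)
```

Mirrors without a DOM component are ignored in this sketch, so only mirrors that actually carry a DOM component have to agree on its size.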

Fix another glitch in lov_init_composite() where dom_size is used
as a __u64 value but declared as boolean.

Lustre-change: https://review.whamcloud.com/52269
Lustre-commit: e2539c0667525aff8d985d018c4ed077d95ba882

Fixes: 44a721b8c1 ("LU-11421 dom: manual OST-to-DOM migration via mirroring")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ia0d08c697dbeeb3aa8d20d9849226afa06360012
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Thu, 8 Jul 2021 14:34:18 +0000 (22:34 +0800)]
LU-14637 flr: get rid of excluding dom+flr support test

Now that DoM+FLR are supported, fix the tests that expect this
combination of features on a file to fail.

Lustre-change: https://review.whamcloud.com/44185
Lustre-commit: 4b52ea1d30b45900787271c4c035fad124abf34a

Fixes: 0bff64be320fd ("LU-9771 flr: to not support dom+flr for phase 1")
Fixes: 44a721b8c1063 ("LU-11421 dom: manual OST-to-DOM migration via mirroring")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9fc76e797e469744107e5d0453b78729226be0ee
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52600
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Vitaliy Kuznetsov [Mon, 9 Oct 2023 12:05:31 +0000 (14:05 +0200)]
EX-8344 lipe: Update manual page

This small patch expands the explanations for some
commands with information from the development files.
Adds one "todo-list.md" file instead of different files
with similar information.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I0a3ebda49525d62cd6ca398f12601e588dc2dd42
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52589
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 7 Sep 2023 07:28:45 +0000 (09:28 +0200)]
LU-17015 gss: support large kerberos token for rpc sec init

If the current Kerberos setup uses large tokens, for example when the PAC
feature is enabled for Kerberos, authentication can fail because the server
side is unable to exchange the token between kernel and userspace.
This limitation is inherent to the sunrpc cache mechanism, which can
only handle tokens up to PAGE_SIZE.

For the RPC sec init phase, use Lustre's upcall cache mechanism
instead of the kernel's deprecated sunrpc cache. The upcall calls a new
userspace command 'l_getauth', which forwards the sec init request to
the lsvcgssd daemon via Unix domain sockets.

Lustre-change: https://review.whamcloud.com/52224
Lustre-commit: TBD (from 8acd059ee2b8d1e4c48c3d9dbb380bca75e1b3be)

Test-Parameters: kerberos=true testlist=sanity-krb5
Change-Id: I709cd79894a5a13fc4cdfab2109c86f2230db3b8
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 12 Oct 2023 14:45:29 +0000 (16:45 +0200)]
LU-17015 build: rework upcall cache

EX-4333 introduced in upcall_cache.c a dependency on md_object.h for
struct lu_ucred. Rework files to move this dependency to a different
file, so that upcall_cache.c can be built in client-only mode.

Fixes: fb0082bba1 ("EX-4333 sec: support supplementary groups from client")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4bcc7e07a4f4886c5994d17cbef72ea09eb1be1d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Fri, 22 Sep 2023 15:48:51 +0000 (17:48 +0200)]
LU-17015 gss: bump token buffer size to 16KiB

A 4 KiB buffer is not large enough to hold the GSS token under some
circumstances, so bump the GSS_CTX_INIT_MAX_LEN value to 16 KiB.

Lustre-change: https://review.whamcloud.com/52475
Lustre-commit: TBD (from 43a540207da0198cc9c45b3c6312c555702b56cb)

Fixes: 9758129177 ("LU-17015 gss: support large kerberos token on client")
Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8e72f1447593d2bf2ae537fcc920ceee20e93c09
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52628
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Shaun Tancheff [Wed, 7 Dec 2022 02:42:33 +0000 (20:42 -0600)]
LU-13485 ldiskfs: Parallel configure tests for ldiskfs

Transform the compile tests in ldiskfs to run in parallel.

Lustre-change: https://review.whamcloud.com/38351
Lustre-commit: 3774b6afbe3b67e869bb61c9cb212cc37e8705fa

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3a097ab5cd18b57e9311980d9aa708ed25f58464
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52655
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Fri, 23 Sep 2022 05:27:14 +0000 (12:27 +0700)]
LU-13485 libcfs: Remove unused iter_type check

The iter_type member check is not used, remove it.

Lustre-change: https://review.whamcloud.com/48091
Lustre-commit: c755373c567090c49589e5aa0d3134847d4b952e

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I48d536a27738e73314feb88317d41d8479c72528
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52683
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Shaun Tancheff [Mon, 3 Oct 2022 05:10:14 +0000 (12:10 +0700)]
LU-13485 lnet: Parallel configure tests for lnet

Transform the compile tests in lustre-lnet to run in parallel.
Also fix the generated Makefile to work with MOFED and in-kernel
OFED.

Configure times on an 8-core 8G VM, current serial vs. parallel:

             serial      parallel
            --------     --------
    real    8m27.824s    1m28.375s
    user    5m29.448s    2m11.558s
    sys     3m48.258s    0m51.763s
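
As a worked check of the figures above, the wall-clock ("real") times can be converted to seconds and compared. This is only an arithmetic sketch; the `to_seconds` helper is a hypothetical name and not part of the patch. The result is a speedup of roughly 5.7x.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '8m27.824s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


serial = to_seconds("8m27.824s")     # "real" time of the serial configure
parallel = to_seconds("1m28.375s")   # "real" time of the parallel configure
speedup = serial / parallel
print(f"wall-clock speedup: {speedup:.2f}x")
```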

Lustre-change: https://review.whamcloud.com/38368
Lustre-commit: fc84caa81b7fb9d27e82229d39f046e83b5ebb7e

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4f0cb8584e1c3149ec3f005dd55fed0c47b50472
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52678
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Thu, 12 Oct 2023 14:44:17 +0000 (22:44 +0800)]
EX-8038 csdc: sending compression info to server

The client fills layout compression info into the obdo and passes it
to the server.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ieb5d7b3609da41f35f8622ed6116f19ce7567ddb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 6 Oct 2023 23:32:03 +0000 (17:32 -0600)]
RM-620 build: New tag 2.14.0-ddn106

New tag 2.14.0-ddn106

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibc557d9c0a56d9994b7fed147e7e183a1b5528db

Andreas Dilger [Fri, 6 Oct 2023 23:30:35 +0000 (17:30 -0600)]
RM-620 build: New tag lipe-2.33

New tag lipe-2.33

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I638f4b593856e9e8c15b180328be9cc23ec3365e

Bobi Jam [Thu, 4 May 2023 01:56:12 +0000 (09:56 +0800)]
LU-16837 llite: handle unknown layout component

If the Lustre client encounters an unknown layout component pattern in
a mirrored file, this patch makes the client mark that mirror as invalid
and skip it.
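
The behaviour described here can be sketched in Python as follows. The `Mirror` type, the `KNOWN_PATTERNS` set and the pattern names in it are illustrative assumptions, not the client's actual data structures.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical set of component patterns this client understands.
KNOWN_PATTERNS = {"raid0", "mdt"}


@dataclass
class Mirror:
    mirror_id: int
    patterns: List[str]   # one pattern name per layout component
    valid: bool = True


def usable_mirrors(mirrors: List[Mirror]) -> List[Mirror]:
    """Mark mirrors with unknown component patterns invalid and skip them."""
    for m in mirrors:
        if any(p not in KNOWN_PATTERNS for p in m.patterns):
            m.valid = False
    return [m for m in mirrors if m.valid]
```

The point of the sketch is that one unrecognised component invalidates only its own mirror, so the file stays readable through the remaining mirrors.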

Lustre-change: https://review.whamcloud.com/51060
Lustre-commit: 14ed4a6f8f231fe94392906f991a32f07e7d7883

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie5f44212ab96bdc706cc5a9e11f330234fc01069
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51061
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Sat, 22 Jul 2023 09:51:56 +0000 (12:51 +0300)]
EX-7948 utils: lamigo to track mirror progress

Pass --stats to the lfs mirror/resync commands and then read
lfs's output over the ssh channel.
This way we can keep the ssh channel alive and interrupt
replication if it doesn't report progress.

The very first time an agent is used, lamigo checks whether
the agent's lfs utility supports stats.
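
The idea of interrupting a replication that stops reporting progress can be sketched as a small watchdog. This is a Python illustration only: the class name, the timeout and the assumption that progress is reported as a byte count are all hypothetical and not taken from lamigo.

```python
class ProgressWatchdog:
    """Track the last time a transfer reported forward progress."""

    def __init__(self, timeout: float, start: float) -> None:
        self.timeout = timeout        # seconds allowed without progress
        self.last_progress = start    # time of the last forward progress
        self.bytes_done = 0

    def report(self, now: float, bytes_done: int) -> None:
        # Only an increase in transferred bytes counts as progress.
        if bytes_done > self.bytes_done:
            self.bytes_done = bytes_done
            self.last_progress = now

    def stalled(self, now: float) -> bool:
        # True once no progress has been seen for longer than the timeout.
        return now - self.last_progress > self.timeout
```

A caller would feed each parsed stats line to `report()` and interrupt the job once `stalled()` returns True.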

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iee5b43cb85dae62550d74667b16e00336f1bf52f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51744
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Amir Shehata [Tue, 5 Sep 2023 19:29:55 +0000 (03:29 +0800)]
LU-9121 lnet: Select NI/peer NI with highest prio

Modify the selection algorithm to select the highest priority
local and peer NI. Health always trumps all other selection
criteria.
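
The ordering described above (health first, priority as the tie-breaker) can be sketched in Python. The `NI` type and its fields are hypothetical, and the conventions that a larger health value is healthier and a smaller priority value is preferred are assumptions made for the example, not statements about the LNet code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NI:
    name: str
    health: int    # higher is healthier (assumption)
    priority: int  # lower value means higher priority (assumption)


def select_best(nis: List[NI]) -> NI:
    """Pick the healthiest NI; use priority only to break health ties."""
    return min(nis, key=lambda ni: (-ni.health, ni.priority))
```

Because health is the first element of the sort key, a less healthy interface is never chosen over a healthier one, whatever its priority.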

Lustre-commit: 3fc2e0e0b3c8353a8fecc6d127ee55d255d7acb7
Lustre-change: https://review.whamcloud.com/34351

Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I487a706f4da30311d0bd59fe03f72dbe68a52425
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52289
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Amir Shehata [Tue, 5 Sep 2023 18:13:08 +0000 (02:13 +0800)]
LU-9121 lnet: foundation patch for selection mod

Add the priority and preferred NIDs fields to lnet_ni,
lnet_net, lnet_peer_net and lnet_peer_ni. Switch
the implementation of the preferred NIDs list to list_head
instead of an array, because the code is more straightforward.
There is more memory overhead due to list_head, but these lists
are expected to be small, so I chose code simplicity over memory.

Lustre-commit: 51b2c0f75f727f0562b3145015357cbff5cbb3b5
Lustre-change: https://review.whamcloud.com/34350

Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I0c75855b736345c25e1604083eee2b65d38ef28d
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 18 Aug 2023 21:55:10 +0000 (21:55 +0000)]
LU-16097 tests: skip quota subtests in interop

Skip subtests in sanity-quota.sh to avoid interop test failures,
backdated to check all new tests since 2.14.0 for completeness.

Test-Parameters: trivial testlist=sanity-quota serverversion=EXA6.1.0
Fixes: 513b1cdbca ("LU-16340 quota: notify only global lqe")
Fixes: d4978678b4 ("LU-15694 quota: keep grace time while setting default")
Fixes: 25a70a88c9 ("LU-13952 quota: default OST Pool Quotas")
Fixes: 188112fc80 ("LU-14300 quota: avoid nested lqe lookup")
Fixes: 8c19365416 ("LU-13971 quota: report Pool Quotas for a user")
Fixes: a4fbe7341b ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Fixes: 789038c97a ("LU-15167 quota: fallocate send UID/GID for quota")
Fixes: c9901b68b4 ("LU-13587 quota: protect qpi in proc")
Fixes: 61ec1e0f2c ("LU-15031 quota: reseed glbe in qmt_lvbo_udate")
Fixes: dfe7d2dd2b ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Fixes: 862f0baa7c ("LU-15097 quota: stop pool_recalc before killing pool")
Fixes: 61481796ac ("LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12")
Fixes: a2fd4d3aee ("LU-15880 quota: fix insane grant quota")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ife8bfd83d0f217c534f3b12b4c9d108d370ed6b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52582

Wei Liu [Tue, 3 Oct 2023 23:13:14 +0000 (16:13 -0700)]
EX-8281 tests: Fix sanity test_56ab when CSDC is enabled

Use /dev/urandom in sanity test_56ab so the data cannot be compressed
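
A short Python sketch shows why random data is a suitable input here. `zlib` stands in for whatever compressor CSDC actually uses, which is an assumption, and the 1 MiB sample size is arbitrary.

```python
import os
import zlib

SIZE = 1 << 20  # 1 MiB sample

random_data = os.urandom(SIZE)  # bytes from the kernel's random source
zero_data = bytes(SIZE)         # all zeros, a highly compressible baseline

# Ratio of compressed size to original size for each input.
random_ratio = len(zlib.compress(random_data)) / SIZE
zero_ratio = len(zlib.compress(zero_data)) / SIZE
print(f"random: {random_ratio:.3f}, zeros: {zero_ratio:.3f}")
```

The random buffer does not shrink, so a test that depends on file sizes sees the same values whether or not compression is enabled.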

Lustre-change: https://review.whamcloud.com/52572
Lustre-commit: TBD  (from 1ce661dd56fb4b6ecc9e909805c6101bbd9c3161)

Test-Parameters: trivial testlist=sanity env=ONLY=56ab

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: I0ceb9afcbdc8443b5e04dff486e41621479dbd23
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52501
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 3 Oct 2023 23:44:25 +0000 (16:44 -0700)]
LU-17152 tests: unmount NFS clients with zconf_umount_clients

This patch fixes cleanup_nfs() to unmount NFS clients by running
zconf_umount_clients(), which can find and kill active processes
that are accessing the NFS mount point so as to avoid the
"device is busy" failure.

The patch also adds racer_on_nfs test into always_except list for
parallel-scale-nfsv4 due to LU-17154.

Lustre-change: https://review.whamcloud.com/52533
Lustre-commit: TBD (from 52a2147e8b0eca74f38b1b87991b53ccf25663cd)

Test-Parameters: trivial testlist=parallel-scale-nfsv4

Change-Id: I37a38502362399540c28e78d1343e768b490ce8b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52534
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Mikhail Pershin [Thu, 20 Jul 2023 10:27:59 +0000 (13:27 +0300)]
EX-3860 llog: extended debug for -ENOTDIR error

Debug patch to catch trace and debug log for -ENOTDIR
error in distribute_txn_cancel_records()

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie1bb7c138282bfa05a2fafcceafdb436d45f28d3
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Fri, 29 Sep 2023 20:19:06 +0000 (21:19 +0100)]
LU-13814 clio: add cp_inode to page allocation

cp_inode can be set correctly during page allocation,
rather than after.  This is a prelude to moving cp_inode to
the osc_transfer_page, but that's better done in a separate
patch.

Lustre-change: https://review.whamcloud.com/52208
Lustre-commit: TBD (from f2afaf4eb10d70c36ad6bdbc2def66bee4fcdc23)
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I509f6cfbae8e5a6ec6b07c8253d68f6dd2794e59
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52557
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Artem Blagodarenko [Sun, 3 Sep 2023 16:38:18 +0000 (17:38 +0100)]
EX-7601 osc: move common CSDC code to the library

CSDC repacks a chunk on the server side in case of a
partial rewrite. There are routines that can be shared
between client and server.

This patch moves the common compression code to
libcfs.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I824211a3435b0479f7a3b8f08598a5b567b67d3c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52262
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Mon, 4 Sep 2023 12:45:34 +0000 (08:45 -0400)]
LU-17087 lmv: update stale tgt statfs every 1 hour

Some tgt statfs may not be initialized upon mount due to network
issues. If the filesystem is imbalanced, these tgts won't be chosen to
create directories because their bavail and ffree are 0.

If the MDT is chosen by QoS, update any tgt statfs that is one hour overdue;
otherwise check and update the statfs of the tgt that is chosen.
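
The staleness test implied here can be sketched in Python. The function and constant names are hypothetical; only the one-hour threshold comes from the description above.

```python
STATFS_MAX_AGE = 3600  # seconds; "one hour" from the commit message


def needs_refresh(last_update: float, now: float,
                  max_age: float = STATFS_MAX_AGE) -> bool:
    """Return True if cached statfs data is older than `max_age` seconds."""
    return now - last_update > max_age
```

In this sketch a target whose statfs was never filled in can be given a `last_update` of 0, so it is treated as overdue once the clock passes the threshold.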

Lustre-commit: e262e0ffbe792ae2f8b47ccdafac38a36151a300
Lustre-change: https://review.whamcloud.com/52270

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I06af8b8bd342f66cb794471df3ee0f3b127ffe05
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52560
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Fri, 22 Sep 2023 13:01:56 +0000 (16:01 +0300)]
LU-17136 ldiskfs: increase max extent tree depth

This is a workaround until LU-16843 is ready.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5829c10888bf32649fe7a7a72c8ee697647a89cc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52540
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 2 Oct 2023 22:35:48 +0000 (00:35 +0200)]
EX-8253 tests: skip sanity-scrub/4e until fixed

Subtest 18 is failing in about 1/5 of sanity-scrub runs, after test_4e
was landed.  Disable test_4e to see if that fixes the issue.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-scrub,sanity-scrub,sanity-scrub
Test-Parameters: testlist=sanity-scrub,sanity-scrub,sanity-scrub
Test-Parameters: testlist=sanity-scrub,sanity-scrub,sanity-scrub
Test-Parameters: testlist=sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia8f1bd9dbf0fdbfabf79b1ead63a0421a8892c82
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52564
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Mon, 2 Oct 2023 01:06:37 +0000 (03:06 +0200)]
RM-620 build: New tag 2.14.0-ddn105

New tag 2.14.0-ddn105

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I28dd2c8ac1550a3c6abb630a32e88423fdbf492e

Andreas Dilger [Mon, 2 Oct 2023 01:06:16 +0000 (03:06 +0200)]
RM-620 build: New tag lipe-2.32

New tag lipe-2.32

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4556f6234985973170f94c06c3572dbfce3204c1

Vitaliy Kuznetsov [Thu, 28 Sep 2023 13:29:47 +0000 (15:29 +0200)]
EX-8191 lipe: Fix test for --collect-fsize-stats in lipe3

This patch modifies the test for collecting statistics
in lipe3 and makes the following changes:
1. Fixes an error getting a username if it doesn't already exist.
2. Fixes an error comparing file sizes after changing table
generation rules.
3. Converts the test from reading yaml to reading json.
4. Generates many files of different sizes
for the test.
5. Retrieves the data for comparison from
the ls utility.
6. Adds a check that creates a user with a
large UID and GID and verifies that reports are available
for this user.

Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I7d0bdcc407bc0d27441c4204511dab2e6a421a5f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52424
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 16 Aug 2023 04:02:22 +0000 (00:02 -0400)]
LU-16954 llite: add SB_I_CGROUPWB on super block for cgroup

Cgroup support can be enabled per super_block by setting
SB_I_CGROUPWB in ->s_iflags.
Cgroup writeback requires support from both the bdi and
filesystem.
This patch adds SB_I_CGROUPWB flag on super block for Lustre.
This is required by the subsequent patch series to support
cgroup in Lustre.

Adding this flag to the Lustre super block causes a remount
failure in Maloo testing on the Ubuntu 22.04 v5.15 kernel due to the
duplicate filename (sysfs) for the bdi device.
To avoid the remount failure, we explicitly unregister the sysfs for
the @bdi.

Lustre-change: https://review.whamcloud.com/51955
Lustre-commit: dcc1dd39a67f15de9174e7acdda599e3c54c1421

Test-Parameters: clientdistro=ubuntu2204 testlist=sanity-sec
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I7fff4f26aa1bfdb0e5de0c4bdbff44ed74d18c2d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52538
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 28 Sep 2023 18:24:12 +0000 (11:24 -0700)]
LU-17133 kernel: update SLES15 SP4 [5.14.21-150400.24.84.1]

Update SLES15 SP4 kernel to 5.14.21-150400.24.84.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/52481
Lustre-commit: TBD (from 5dcdbe687d136d7e976f578faccbb3bde1b0acc9)

Test-Parameters: trivial clientdistro=sles15sp4 testlist=sanity

Change-Id: I5bce1642fc5bd212fd89dd65d9e1beb32ccd744d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52546
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 13 Sep 2023 05:12:18 +0000 (23:12 -0600)]
LU-17010 lfsck: don't dump stack repeatedly

If there are transactions started with LFSCK in dry-run mode, don't
dump the stack repeatedly, as this can spam the console logs and
significantly hurt performance.

Lustre-commit: dc360cd3eff20618f243ab89097a62f8ecf2c929
Lustre-change: https://review.whamcloud.com/52356

Test-Parameters: trivial testlist=sanity-lfsck
Fixes: 0c1ae1cb9c ("LU-13124 scrub: check for multiple linked file")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0b0d64911453dc8ab947e284656311b5d0300c1e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52541
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Jian Yu [Thu, 28 Sep 2023 17:40:45 +0000 (10:40 -0700)]
LU-17095 build: avoid modules.order nonexistence failure

The modules.order is a temporary output file generated by
kbuild while running the "make" command. Sometimes a
race condition causes the file not to be created, which makes the
make command fail as follows:

cat: ...//modules.order: No such file or directory

This patch creates an empty modules.order file to avoid
the error.
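
The workaround amounts to making sure the file exists before anything reads it. The sketch below shows the idea in Python; the actual patch changes the build scripts, and the `ensure_modules_order` helper and its `build_dir` argument are hypothetical.

```python
from pathlib import Path


def ensure_modules_order(build_dir: str) -> Path:
    """Create an empty modules.order in `build_dir` if it is missing."""
    path = Path(build_dir) / "modules.order"
    path.touch(exist_ok=True)  # leaves an existing file's contents untouched
    return path
```

Creating the file only when it is absent keeps whatever kbuild wrote in the normal case, while a later `cat` no longer fails when the race is lost.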

Lustre-change: https://review.whamcloud.com/52323
Lustre-commit: dbe4f860977455a9abe50165645a025bb6c46350

Test-Parameters: trivial

Change-Id: If779a727731f18e9409c35c0cd0deddd79559d3a
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52544
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Thu, 28 Sep 2023 00:09:03 +0000 (20:09 -0400)]
EX-8245 ptlrpc: always do vmalloc

If we were ever to do an allocation with kmalloc, we could
get non-page aligned memory.  So just use vmalloc directly.

Sadly, this isn't the problem with infiniband.  We never
ask for < 8192, which is the libcfs kmalloc/vmalloc cutoff.

Still, this is a timebomb if we ever changed the libcfs
kmalloc/vmalloc cutoff, so, fix it.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id20898065b516d363d9dc280e71be1b5cfb6f4a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52532
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sun, 1 Oct 2023 03:45:26 +0000 (03:45 +0000)]
EX-7342 revert: "test: remove extra cleanup and qp check"

This reverts commit 1dbe9be20011893ca46ccbbd2676e8063af4158d.
This causes 100% sanity-quota timeouts in test_79.

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I77a168bde4b53b69a197a4036b31b36f792ebae3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52561
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Andreas Dilger [Thu, 28 Sep 2023 08:50:44 +0000 (02:50 -0600)]
RM-620 build: New tag 2.14.0-ddn104

New tag 2.14.0-ddn104

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I118f71235f7200f1e880424d5e7ac43334186ba3

Shaun Tancheff [Sun, 2 Apr 2023 16:33:44 +0000 (11:33 -0500)]
LU-16699 osc: Prefer NR_ZONE_WRITE_PENDING

Linux commit v4.7-5966-g5a1c84b404a7
 mm: remove reclaim and compaction retry approximations

introduced NR_ZONE_WRITE_PENDING, which should be used
in mod_zone_page_state.

Older kernels should fall back to NR_UNSTABLE_NFS
or NR_WRITEBACK.

Lustre-change: https://review.whamcloud.com/50499
Lustre-commit: d4094475c990d6ee8bf9e6e32a93f7c86a78f57a

Test-Parameters: trivial
HPE-bug-id: LUS-11559
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I90f22d4bd56f5986eaa5d4a042a2c8ed31fbf752
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52526
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 28 Mar 2023 15:02:40 +0000 (11:02 -0400)]
LU-16671 osc: fix unstable pages for short IO

Unstable pages support was written with theoretical support for
short IO (i.e., no bulk, data-in-rpc, LU-1757), but since the
short IO code wasn't merged until years later, they were
probably never tested together.  When they are used together, it
crashes.

In truth, short IO has no separate pages to be tracked,
which is why this is crashing.  This means that small write
RPCs won't be tracked in unstable pages, but that's a very
minor limitation and unlikely to cause trouble.  (and since
RPC allocations are not 'pages', they're just malloc'ed,
there's no good way to track them anyway)

Lustre-change: https://review.whamcloud.com/50451
Lustre-commit: 4ba4976f525e957ef4c3ca7981bea01f72109ed6

Fixes: 70f092a ("LU-1757 brw: add short io osc/ost transfer.")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I34b09f8324424c3ff0b0c09c86f01c938b643e37
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52524
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Fri, 15 Sep 2023 11:23:19 +0000 (13:23 +0200)]
LU-17015 obdclass: new primitives for upcall cache

This patch adds 2 new primitives to the upcall cache mechanism:
- upcall_cache_get_entry_raw: get a ref on an existing entry;
- upcall_cache_update_entry: modify expiry time and state of an entry.

Lustre-change: https://review.whamcloud.com/52389
Lustre-commit: 2ddb1d33245c23c4cafe64fb917323bdf567c81f

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4825f09ae807abb52ebe0e24719dcd915e8c8aef
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoEX-8277 llite: set max compression size to 64 MiB
Patrick Farrell [Thu, 21 Sep 2023 21:17:36 +0000 (17:17 -0400)]
EX-8277 llite: set max compression size to 64 MiB

Compression size should never be larger than RPC size, so
set it to a maximum of 64 MiB.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia5958db3504f4f442fbd41e48416924debc26192
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52466
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-7342 test: remove extra cleanup and qp check
Hongchao Zhang [Fri, 22 Sep 2023 03:22:17 +0000 (23:22 -0400)]
EX-7342 test: remove extra cleanup and qp check

test_79 in sanity-quota needs quota pool support, and
the cleanup of the "stop file" is already included in the
stack_trap, so there is no need to do it explicitly.

Test-Parameters: trivial
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: If86a1d0187b4b95d0c5e24f11f5f058280726e64
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52472
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8189 osc: do not compress resends
Patrick Farrell [Fri, 22 Sep 2023 22:33:14 +0000 (18:33 -0400)]
EX-8189 osc: do not compress resends

There's some issue with doing compression on resent
requests, so this patch works around it with two things:
1. Use the uncompressed page array for resend
(this was always necessary unless we modified resend to
know it already had compressed pages as input)
2. Disable compression on resend (not clear why 1. wasn't
enough)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5fbbdc2771f8c2c7b5c28f0b70d89b8b6015147f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52484
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-16973 osd: adds SB_KERNMOUNT flag
Alex Zhuravlev [Mon, 25 Sep 2023 09:20:41 +0000 (12:20 +0300)]
LU-16973 osd: adds SB_KERNMOUNT flag

During umount, mntput() is called. It uses the delayed_mntput()
function, which can take a long time to finish. The block
device stays occupied during the delayed work.

[ 8753.941980] Lustre: server umount XXX complete
[ 8800.129136] sysrq: SysRq : Trigger a crash

PID: 319306   TASK:XXXX   CPU: 2    COMMAND: "kworker/2:0"
 #0 __schedule at ffffffff9754e1d4
 #1 preempt_schedule_common at ffffffff9754e6fa
 #2 _cond_resched at ffffffff9754e72d
 #3 invalidate_mapping_pages at ffffffff96e72da5
 #4 invalidate_bdev at ffffffff96f5d13c
 #5 ldiskfs_put_super at ffffffffc1c82e34 [ldiskfs]
 #6 generic_shutdown_super at ffffffff96f1bdcc
 #7 kill_block_super at ffffffff96f1bed1
 #8 deactivate_locked_super at ffffffff96f1b784
 #9 cleanup_mnt at ffffffff96f3b86b

Let's use the SB_KERNMOUNT flag during mount, which leads to a
synchronous mntput().
This patch also calls flush_delayed_fput during umount to
finish delayed fput.

Lustre-change: https://review.whamcloud.com//51731
Lustre-commit: eff11c8ce1f89f30dcc5af88b67b3d6c15a631a6

Change-Id: Ia6729f6cbac85c3626562e946a4b96665a143714
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52495
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-16218 utils: add component flags "prefrd" and "prefwr"
Jian Yu [Wed, 27 Sep 2023 07:02:15 +0000 (00:02 -0700)]
LU-16218 utils: add component flags "prefrd" and "prefwr"

The initial implementation of "lfs setstripe ... --comp-flags=prefer"
only allowed specifying a single "prefer" argument for a given
mirror component, which would set both the "LCME_FL_PREF_RD" and
"LCME_FL_PREF_WR" flags at the same time.

This patch adds the separated component flags "prefrd" and "prefwr"
to allow setting the individual flags on a component.
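
The relationship between the three flag names can be sketched as a small bitmask table. This is only an illustration: the bit values below are placeholders rather than the real LCME_FL_* values, parse_comp_flags is a hypothetical helper, and the comma-separated input form is an assumption.

```python
from enum import IntFlag


class CompFlag(IntFlag):
    # Placeholder bit values for this sketch only; they are NOT the
    # real LCME_FL_PREF_RD / LCME_FL_PREF_WR values.
    PREF_RD = 0x1
    PREF_WR = 0x2


# "prefer" sets both bits at once; the new names set one bit each.
FLAG_NAMES = {
    "prefer": CompFlag.PREF_RD | CompFlag.PREF_WR,
    "prefrd": CompFlag.PREF_RD,
    "prefwr": CompFlag.PREF_WR,
}


def parse_comp_flags(spec: str) -> CompFlag:
    """Hypothetical helper: OR together a comma-separated list of flag names."""
    flags = CompFlag(0)
    for name in spec.split(","):
        flags |= FLAG_NAMES[name.strip()]
    return flags
```

Under these assumptions, "prefrd,prefwr" yields the same mask as "prefer", while "prefrd" alone leaves the write-preference bit clear.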

Lustre-change: https://review.whamcloud.com/52508
Lustre-commit: TBD (from a4cd76c790c46b4bf6d85e386b1054d4b925e095)

Test-Parameters: trivial testlist=sanity-flr

Change-Id: I3e413cb37fab7ab2834946536705ce61a3feeed4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52525
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-16896 flr: resync should not change file size
Bobi Jam [Thu, 24 Aug 2023 01:43:04 +0000 (09:43 +0800)]
LU-16896 flr: resync should not change file size

Mirror resync could punch a hole reaching the end of the file in a
mirror, which could change the file size when that mirror is referenced.

This patch calls truncate after punch in this case to keep the file
size unchanged in the mirror.

Lustre-change: https://review.whamcloud.com/51344
Lustre-commit: b9ce342ee196af48d2d25e2811121fe4471f5fd2

Also pick up commit 4cd4bfba473fb370767e1f2014d9fe1531889f82
("LU-16813 utils: move mirror_end initialization") to move
initialization for mirror_end variable in llapi_mirror_resync_many(),
otherwise lfs mirror resync may fail since mirror_end gets reset on
each pass of the loop.

Lustre-change: https://review.whamcloud.com/50919
Lustre-commit: 4cd4bfba473fb370767e1f2014d9fe1531889f82

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ia0fc1f220a32a60f3516c69e86867796ae5c35c7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52061
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-11912 tests: consume precreated objects in parallel
Andreas Dilger [Tue, 13 Jun 2023 07:02:22 +0000 (01:02 -0600)]
LU-11912 tests: consume precreated objects in parallel

Run the force_new_seq_all() file creations in parallel, since
this can take a significant amount of time when there are multiple
MDTs and OSTs (up to 1000s for 4x MDTs and 8x OSTs).

Lustre-change: https://review.whamcloud.com/51292
Lustre-commit: 656fc937cfd3fc3b65cb21a7f93a6bd4cc07fc0e

Test-Parameters: trivial testlist=replay-dual mdscount=2 mdtcount=4
Test-Parameters: testlist=replay-ost-single mdscount=2 mdtcount=4
Test-Parameters: testlist=replay-single mdscount=2 mdtcount=4
Test-Parameters: testlist=sanity-pfl env=ONLY="0 1 16 27" mdscount=2 mdtcount=4
Fixes: 2fdb1f8d01b9f ("LU-11912 tests: SEQ rollover fixes")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I849370586fe320d1f7df069f0b83980449658d97
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-11912 ofd: reduce LUSTRE_DATA_SEQ_MAX_WIDTH
Li Dongyang [Mon, 22 Nov 2021 11:43:03 +0000 (22:43 +1100)]
LU-11912 ofd: reduce LUSTRE_DATA_SEQ_MAX_WIDTH

Reduce LUSTRE_DATA_SEQ_MAX_WIDTH from ~4B to ~32M
to limit the number of objects under /O/[seq]/d[0..31]
dir on OSTs.
This keeps the directories optimal for ldiskfs and
avoids going into largedir/3-level htree territory.
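
As a rough back-of-the-envelope check of these numbers, the sketch below spreads each sequence width evenly across the 32 d[0..31] subdirectories. The exact constants are assumptions (the message only says "~4B" and "~32M"), and objects_per_subdir is a hypothetical helper.

```python
# Assumed values; the commit message only gives "~4B" and "~32M".
OLD_SEQ_WIDTH = 0xFFFFFFFF  # roughly 4 billion objects per sequence
NEW_SEQ_WIDTH = 0x1FFFFFF   # roughly 32 million objects per sequence
SUBDIRS = 32                # d0 .. d31 under /O/[seq]/


def objects_per_subdir(seq_width: int, subdirs: int = SUBDIRS) -> int:
    """Upper bound on objects in one d[N] directory, assuming an even spread."""
    return -(-seq_width // subdirs)  # ceiling division


old_per_dir = objects_per_subdir(OLD_SEQ_WIDTH)  # about 128M entries
new_per_dir = objects_per_subdir(NEW_SEQ_WIDTH)  # about 1M entries
```

With these assumed constants, the worst case per directory drops from roughly 128M entries to roughly 1M.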

Remove the hard-coded LUSTRE_DATA_SEQ_MAX_WIDTH checks
in ofd and make them check seq->lcs_width, which is
a tunable set to LUSTRE_DATA_SEQ_MAX_WIDTH by default.
Allow values up to IDIF_MAX_OID if a larger seq width
is needed.

Use the obdo->o_size in the OST_CREATE rpc reply on ofd
to update osp with the current seq width setting.
osp then uses this seq width to determine when to rollover
to a new seq.

The seq will rollover when the seq width is exhausted,
the default is LUSTRE_DATA_SEQ_MAX_WIDTH.
For seq >= FID_SEQ_NORMAL objects, the upper limit of
seq width is OBIF_MAX_OID,
For IDIF/MDT0 objects, the upper limit is IDIF_MAX_OID.
The seq FID_SEQ_OST_MDT0 will change to a normal seq after the
rollover.

Fix osp_precreate_reserve for the case where the last
precreated object is at the end of the seq and
osp_objs_precreated cannot hold all the requested objects.
The mdt thread would get stuck: it wakes up the osp precreate
thread in a loop to make progress, but the osp thread will not
try to do anything until the seq is used up. This is easier
to see when seq->lcs_width is set to a low number and an
overstriped file is created with a stripe count bigger than
seq->lcs_width.

Fix the precreate thread spinning when the precreate pool
is at the end of the seq, and is nearly empty.

Change the seq->lcs_width to 16384 for all tests in
test-framework.sh, except for a few slow tests (to avoid
timeouts) and some overstriping tests creating
LOV_MAX_STRIPE_COUNT (to avoid overstriping creating fewer
objects than expected when the precreate pool is at the end
of the seq and there are not enough objects).

Fix the problem where seq could still change after
replay_barrier. To achieve this, introduce new fail_loc
OBD_FAIL_OSP_FORCE_NEW_SEQ and force_new_seq/force_new_seq_all
to drain the objects in the precreate pool then rollover to a
new seq. This applies to a bunch of test suites heavily using
replay_barrier.

Lustre-change: https://review.whamcloud.com/38424
Lustre-commit: 0ecb2a167c56ffff8e4fcb5cf576fb8c5d9e64fe

LU-14692 tests: wait for osp in conf-sanity/84

Wait for osp to change the first IDIF SEQ to a
normal SEQ, before using replay_barrier.
Otherwise the SEQ change could get lost and we
will trigger LASSERT during replay.

Lustre-change: https://review.whamcloud.com/50477
Lustre-commit: a9b7d73964b8b655c6c628820464342309f11356

Change-Id: I2749c1004b7bf3197b691cc94527f90145bcdef8
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>

LU-11912 tests: SEQ rollover fixes

To avoid changing SEQ after replay_barrier, we
use force_new_seq when starting the test suites that heavily
use replay_barrier, e.g. replay-single.
However, when there are fewer OSTs, the default 16384
SEQ width may not last the entire test suite, so SEQ
rollover could still happen randomly after replay_barrier.

To overcome this, change the default OSTSEQWIDTH to
65536, and divide by number of OSTs, so the SEQ width is
larger with fewer OSTs. For 8 OSTs, the SEQ width is 16384,
and make sure we don't go under it.
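
The scaling rule described above can be sketched as follows. This is a minimal illustration using only the two numbers stated in this message (65536 and 16384); seq_width_for is a hypothetical helper name, and the actual test-framework.sh logic may differ.

```python
DEFAULT_OSTSEQWIDTH = 65536  # default stated in the message
MIN_SEQ_WIDTH = 16384        # floor stated in the message


def seq_width_for(ost_count: int, ostseqwidth: int = DEFAULT_OSTSEQWIDTH) -> int:
    """Divide the total width by the OST count, never going below the floor."""
    if ost_count < 1:
        raise ValueError("ost_count must be at least 1")
    return max(ostseqwidth // ost_count, MIN_SEQ_WIDTH)


# Fewer OSTs get a larger per-OST SEQ width; larger counts hit the floor.
widths = {n: seq_width_for(n) for n in (1, 2, 4, 8)}
```

With these inputs, 1 and 2 OSTs get 65536 and 32768 respectively, while 4 and 8 OSTs both end up at the 16384 floor.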

Use force_new_seq_all for the test suites using replay_barrier
on MDTs other than mds1.

Add force_new_seq_all for replay-ost-single, which uses
replay_barrier on the OST. If SEQ rollover happens after that,
the SEQ range update on ofd is lost due to replay_barrier, and
the next attempt to allocate a new SEQ will end up with an
old one.

Use force_new_seq_all for the test cases (namely sanity-pfl/0b
0c 1c 16b sanity/27Cd) checking the number of stripes created
with overstriping, to make sure we have enough objects
in the precreate pool.

Lustre-change: https://review.whamcloud.com/50478
Lustre-commit: 2fdb1f8d01b9f55f8270b48edc0e105e40d42f55

Test-Parameters: ostcount=4 testlist=replay-single
Test-Parameters: ostcount=2 testlist=replay-single
Test-Parameters: mdtcount=2 testlist=conf-sanity env=ONLY=122a,ONLY_REPEAT=10
Test-Parameters: testlist=sanity,sanity-pfl
Test-Parameters: testlist=sanity-scrub,replay-single,obdfilter-survey,replay-ost-single,large-scale
Fixes: 0ecb2a167c ("LU-11912 ofd: reduce LUSTRE_DATA_SEQ_MAX_WIDTH")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I2749c1004b7bf3197b691cc94527f90145bcdef8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50760
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-14692 osp: deprecate IDIF sequence for MDT0000
Li Dongyang [Fri, 10 Dec 2021 11:44:09 +0000 (22:44 +1100)]
LU-14692 osp: deprecate IDIF sequence for MDT0000

Always return true for an IDIF seq in osp_fid_end_seq
so osp precreate will rollover to a new seq in
the FID_SEQ_NORMAL range for MDT0000.

Remove conf-sanity test_122b:
Check OST sequence wouldn't change when IDIF 32bit overflows

Lustre-change: https://review.whamcloud.com/45822
Lustre-commit: 6d2e7d191a7b27cde62b605dbed14488cfd4d410

Change-Id: I85a0e38266331c96d971d68ec353949ccac3fc21
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50758
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-10026 osd-ldiskfs: use preallocation for dense writes
Alex Zhuravlev [Tue, 27 Jun 2023 06:59:31 +0000 (09:59 +0300)]
LU-10026 osd-ldiskfs: use preallocation for dense writes

Use the inode's preallocation chunks as per-inode group preallocation:
just grab the very first available blocks from the window.

Lustre-change: https://review.whamcloud.com//50171
Lustre-commit: TBD (from 986340bcdfa572a1f6bab34014e0474c89f47691)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9d36701f569f4c6305bc46f3373bfc054fcd61a9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51468
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoRM-620 build: New tag 2.14.0-ddn103
Andreas Dilger [Fri, 22 Sep 2023 23:58:31 +0000 (17:58 -0600)]
RM-620 build: New tag 2.14.0-ddn103

New tag 2.14.0-ddn103

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iab8b19200c1ff9947b21c0e150ede242af5cede5

20 months agoLU-17128 build: fix lnet.service missing issue on Ubuntu 22.04
Jian Yu [Mon, 18 Sep 2023 16:03:16 +0000 (09:03 -0700)]
LU-17128 build: fix lnet.service missing issue on Ubuntu 22.04

The lnet.service file was in the lustre-client-utils package
built on Ubuntu 20.04, but it was missing from the package
built on Ubuntu 22.04.

This is caused by the dpkg-buildpackage change introduced by
dpkg version 1.21.1ubuntu2.1 installed by default on Ubuntu 22.04.
To fix this issue, we need to specify build profiles explicitly
to dpkg-buildpackage via -P|--build-profiles option instead of
just setting the environment variable DEB_BUILD_PROFILES.

Lustre-change: https://review.whamcloud.com/52404
Lustre-commit: TBD (from 59eef55fe08d761be91d9bce207d43f99769cf08)

Test-Parameters: trivial clientdistro=ubuntu2004
Test-Parameters: trivial clientdistro=ubuntu2204

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I9975ef357f0aba722c56d27eaa9b2cfbccc9c524
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52405
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-7795 scripts: add whole file to compression scan
Patrick Farrell [Mon, 18 Sep 2023 21:12:46 +0000 (17:12 -0400)]
EX-7795 scripts: add whole file to compression scan

Add a mode where the compression scan script compresses the
entire file, which in theory should 100% match the
compression results from using CSDC and allow a test to
calculate the exact space usage reduction expected by
using CSDC.

This is intended to be used mostly for testing.

Change help documentation slightly to make clear this can
also accept a path to a single file.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I606a33d686d87dd631bf5b33dc85ee8c24fe9f67
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52406
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename 'pools' to 'ptr_pages'
Patrick Farrell [Thu, 21 Sep 2023 16:25:51 +0000 (12:25 -0400)]
EX-8270 ptlrpc: rename 'pools' to 'ptr_pages'

This finalizes the removal of the overloading of 'pools'
to also mean pages of pointers to items in each page pool.

This patch is currently the last in the series so it gets
full testing.  (This may change, but it's true now.)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0f4aba95f573f4afdc6f5d92f22fd67391fa6dab
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52461
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename npools to nptr_pages
Patrick Farrell [Thu, 21 Sep 2023 16:16:44 +0000 (12:16 -0400)]
EX-8270 ptlrpc: rename npools to nptr_pages

Continue removal of 'pool' as a name for a page of pointers
to items in a pool.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I97b320027a0a6b5870d246e1527fa3fbe15fccb5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52460
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename max_pools to max_ptr_pages
Patrick Farrell [Thu, 21 Sep 2023 16:09:57 +0000 (12:09 -0400)]
EX-8270 ptlrpc: rename max_pools to max_ptr_pages

Continue removal of referring to page pointers as pools
with another rename.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I14796f670a7f06fbec3b40ec23b4dd2e50f22d46
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: change "pool" to "page_ptrs"
Patrick Farrell [Thu, 21 Sep 2023 16:05:16 +0000 (12:05 -0400)]
EX-8270 ptlrpc: change "pool" to "page_ptrs"

The page pool code *also* likes to refer to each page of
pointers it uses to track items in it as a "POOL", which is
incredibly confusing.

This patch works on renaming that to page_ptrs, but leaves
some steps for a future patch.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I56ee54c7f39b52d7cceffec9e3decf71bd313ddc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52458
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: remove PAGES_PER_POOL macro
Patrick Farrell [Thu, 21 Sep 2023 15:47:59 +0000 (11:47 -0400)]
EX-8270 ptlrpc: remove PAGES_PER_POOL macro

The page pool code *also* likes to refer to each page of
pointers it uses to track items in it as a "POOL", which is
incredibly confusing.

Start unwinding this by removing the PAGES_PER_POOL macro.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie29434f53eeb945b8d35df7c1212ae3f51a2aafa
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: remove PAGES_POOL macro
Patrick Farrell [Thu, 21 Sep 2023 15:15:12 +0000 (11:15 -0400)]
EX-8270 ptlrpc: remove PAGES_POOL macro

PAGES_POOL is just the order 0 pool now, so remove the
special naming, and adjust a few associated functions.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I09e1debeadecbce33c7be43a8859815084623358
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52456
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename INDEX macros
Patrick Farrell [Thu, 21 Sep 2023 15:09:17 +0000 (11:09 -0400)]
EX-8270 ptlrpc: rename INDEX macros

Rename INDEX macros to ORDER.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic1123d25bc855dc7671c9cb587a0d6680662b729
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52455
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename ppp_index to ppp_order
Patrick Farrell [Thu, 21 Sep 2023 15:08:21 +0000 (11:08 -0400)]
EX-8270 ptlrpc: rename ppp_index to ppp_order

Rename ppp_index to ppp_order.

Other renames will be in a subsequent patch, to keep these
as simple as possible.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I96559e27a67b7cc4e56e06378e5686370438850c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52454
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: begin renaming pool_index to order
Patrick Farrell [Thu, 21 Sep 2023 15:06:06 +0000 (11:06 -0400)]
EX-8270 ptlrpc: begin renaming pool_index to order

Replace local variables for pool_index with pool_order.

Other renames will be in a subsequent patch, to keep these
as simple as possible.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If347ff39776f9a75c0f7d9d9981d01e19bc2cbc9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52453
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: make {get,put}_pages take order
Patrick Farrell [Wed, 20 Sep 2023 22:02:36 +0000 (18:02 -0400)]
EX-8270 ptlrpc: make {get,put}_pages take order

Pool index and pool order are now the same thing.  Let's
start by changing {get,put} pages, then we'll flow the
change in to the rest of the code.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1ad789b9c848a8ef601b21271d32fe0c2bb929c8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52446
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: simplify pool arrays
Patrick Farrell [Thu, 21 Sep 2023 14:49:47 +0000 (10:49 -0400)]
EX-8270 ptlrpc: simplify pool arrays

Currently, we do a fancy trick where we have a pool of
order 0, then subsequent pools start at
PPOOL_MIN_CHUNK_BITS (which is actually the minimum
compression size).

So pool index 1 isn't a pool of order 1 (2 pages), it's a
pool of order PPOOL_MIN_CHUNK_BITS.

All this saves us is the cost of the empty pools below
PPOOL_MIN_CHUNK_BITS, but it makes the code notably harder
to read.

With this change, the order of the pool and the pool index
are the same.  This simplification will be embraced more
in subsequent patches.
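
The difference between the two layouts can be sketched as a pair of index-to-order mappings. This is an illustration only: MIN_CHUNK_ORDER is a hypothetical stand-in for PPOOL_MIN_CHUNK_BITS with an arbitrary value, and the assumption that the old indexes 2, 3, ... continued consecutively from the minimum is inferred from the description, not confirmed by it.

```python
MIN_CHUNK_ORDER = 4  # hypothetical stand-in for PPOOL_MIN_CHUNK_BITS


def old_index_to_order(index: int) -> int:
    """Old layout: index 0 is order 0, index 1 jumps to the minimum chunk order."""
    return 0 if index == 0 else MIN_CHUNK_ORDER + index - 1


def new_index_to_order(index: int) -> int:
    """New layout: the pool index and the pool order are the same number."""
    return index


# Old layout skips the orders between 0 and the minimum; new layout does not.
old_orders = [old_index_to_order(i) for i in range(3)]
new_orders = [new_index_to_order(i) for i in range(3)]
```

The new layout keeps empty pools for the unused low orders, which is the small cost the message mentions in exchange for readability.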

Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I650e05d25727f10b0ca2d556cba17e9c4fccc309
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: replace ELEMENT_SIZE
Patrick Farrell [Wed, 20 Sep 2023 21:55:42 +0000 (17:55 -0400)]
EX-8270 ptlrpc: replace ELEMENT_SIZE

The ELEMENT_SIZE macro is fine, but it takes a pool index
and doesn't handle the pool of order 0.  Change it to a
function.  (This is marginally less efficient in one spot,
since it replaces a shift with a divide, but it should be
just fine.)

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I322037e50bbdb8e0274b37f82618b6907b6d2906
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52445
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: refactor pool growing code
Patrick Farrell [Wed, 20 Sep 2023 21:26:31 +0000 (17:26 -0400)]
EX-8270 ptlrpc: refactor pool growing code

This refactors the pool growing code, combining two
separate instances of it into a single function.

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I175abc7e61d55563e989f87207a8c59da852f5f9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: convert to void
Patrick Farrell [Wed, 20 Sep 2023 18:43:51 +0000 (14:43 -0400)]
EX-8270 ptlrpc: convert to void

Convert functions without meaningful return to void.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81f0baefd5b77b60ba699fa8749eaa83acadd8dd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52438
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: stop passing around pool_index
Patrick Farrell [Wed, 20 Sep 2023 18:38:58 +0000 (14:38 -0400)]
EX-8270 ptlrpc: stop passing around pool_index

We pass pool_index around from function to function over
and over, but it's easier to just pass the pool around.

This does require the pool to know its own index, but
that seems better anyway.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I42dc8b8094212c69b7a29cc3766bd0a10860f7af
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: reduce usage of pool_index
Patrick Farrell [Wed, 20 Sep 2023 18:17:01 +0000 (14:17 -0400)]
EX-8270 ptlrpc: reduce usage of pool_index

The pool index is used over and over in a lot of places where
we should just use it once.

Note the printing functions are deliberately not combined
to maximum length lines for ease of reading.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7efbf491cf28f6fd16d06f5bbc42d714c908f34c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52436
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: correct use of plural 'pools'
Patrick Farrell [Wed, 20 Sep 2023 17:57:29 +0000 (13:57 -0400)]
EX-8270 ptlrpc: correct use of plural 'pools'

There are a bunch of spots which refer to a single pool by
pool index, but which say 'pools'.  This is very confusing,
and in fact led to me misunderstanding the code at least
once.

Clean that up.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9eabcfe77a57a82c87b36e3b3e040be91671fbfb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52435
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: simplify pools_should_grow
Patrick Farrell [Wed, 20 Sep 2023 17:52:16 +0000 (13:52 -0400)]
EX-8270 ptlrpc: simplify pools_should_grow

This patch is a prelude to replacing "pools_should_grow()"
with a "grow_pool" function.  (The odd plural will be
removed shortly.)

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0accbce2c36fa97684fbee364057b8ff2f9ae12d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52434
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: improve use of 'count'
Patrick Farrell [Wed, 20 Sep 2023 17:40:22 +0000 (13:40 -0400)]
EX-8270 ptlrpc: improve use of 'count'

This is a first trivial step towards fixing usage of
'count' in the page pools code.  (And a whitespace fix.)

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic4f74db74b8cec63572d5fd5b129f861ab0cba7c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: remove more uses of 'enc'
Patrick Farrell [Wed, 20 Sep 2023 17:36:31 +0000 (13:36 -0400)]
EX-8270 ptlrpc: remove more uses of 'enc'

Remove a few more uses of 'enc' and note some we aren't
changing.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iaaf6c23ea295b22ded2e8942227ebd5ce4d34e13
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52432
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename 'epp' to 'ppp'
Patrick Farrell [Sat, 16 Sep 2023 04:04:26 +0000 (00:04 -0400)]
EX-8270 ptlrpc: rename 'epp' to 'ppp'

Finish removing 'encryption' from page pool names except
for the module parameter, which is exposed in configuration
and so can't be changed.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1c14f6cf8cf1a19d89b5a7787aac1b67203866d3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52431
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: start removing 'enc' from pool
Patrick Farrell [Sat, 16 Sep 2023 03:58:38 +0000 (23:58 -0400)]
EX-8270 ptlrpc: start removing 'enc' from pool

Pools are no longer encryption page pools, start renaming
them accordingly.  (The 'epp' naming in the struct has been
left for the next patch.)

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iba3c98641e24173d95bf8bcf0df2424bbabf3ef9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52430
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: improve usage of PAGES_POOL
Patrick Farrell [Sat, 16 Sep 2023 03:48:15 +0000 (23:48 -0400)]
EX-8270 ptlrpc: improve usage of PAGES_POOL

PAGES_POOL isn't always used when it should be, let's
improve that a bit (and start renaming a function).

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ifed59db63d15d61d15712e6df6b8dbae56f2f5b7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename get_buf to get_pages
Patrick Farrell [Sat, 16 Sep 2023 03:36:19 +0000 (23:36 -0400)]
EX-8270 ptlrpc: rename get_buf to get_pages

The sptlrpc_enc_pool_get_buf function actually gets a fixed
number of pages, which is sort of a buffer, but is better
understood as a set of pages.

Rename the function for getting pages for a ptlrpc desc so
we can give get_buf a more appropriate name.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9c03b9d638e7df7f09bf5724c5a6896b7d1e7b6c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename 'size_bits' to 'order'
Patrick Farrell [Sat, 16 Sep 2023 03:24:51 +0000 (23:24 -0400)]
EX-8270 ptlrpc: rename 'size_bits' to 'order'

The kernel uses 'order' to refer to page allocations of a
certain 'order', meaning 2^order pages.

That's what our 'size bits' is - an allocation of a certain
'order'.
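
As a minimal illustration of the 'order' convention described above (this is
not code from the patch), the sketch below converts an order into a page count
and a byte size. The 4096-byte page size and both function names are
assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; real systems may differ


def pages_for_order(order: int) -> int:
    """Return the number of pages in an allocation of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return 1 << order  # 2^order pages


def bytes_for_order(order: int, page_size: int = PAGE_SIZE) -> int:
    """Return the allocation size in bytes for the given order."""
    return pages_for_order(order) * page_size
```

For example, an order of 3 corresponds to 8 pages under this convention.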

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I38b184239814a0f692b644566075c798ed16f816
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52427
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 ptlrpc: rename 'pool' to 'pool_idx'
Patrick Farrell [Fri, 15 Sep 2023 17:36:20 +0000 (13:36 -0400)]
EX-8270 ptlrpc: rename 'pool' to 'pool_idx'

'pool' here is the index of the pool, not the pool itself.
Let's give it a name that makes clear it's a number and not
the actual pool.

Also remove an error condition which is asserted on
immediately before.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I636a4756a033d0b96a4772b8912f61c4b31b9c64
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52426
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8270 osc: minor compression cleanups
Patrick Farrell [Fri, 15 Sep 2023 17:30:04 +0000 (13:30 -0400)]
EX-8270 osc: minor compression cleanups

This cleans up some style and argument issues that made
the code a little harder to follow.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia3492ae79acf6c83d724cc91b0201c7872325853
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-17132 kernel: update RHEL 8.8 [4.18.0-477.27.1.el8_8]
Jian Yu [Wed, 20 Sep 2023 19:31:44 +0000 (12:31 -0700)]
LU-17132 kernel: update RHEL 8.8 [4.18.0-477.27.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.27.1.el8_8.

Lustre-change: https://review.whamcloud.com/52422
Lustre-commit: TBD (from 4b2d932cdf9813e3fffafdd24f2ba14f02e95822)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I4edd823b273c75618bc6dea236be8d64ed7c13ed
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52439
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-17112 kernel: update RHEL 7.9 [3.10.0-1160.99.1.el7]
Jian Yu [Wed, 20 Sep 2023 19:52:56 +0000 (12:52 -0700)]
LU-17112 kernel: update RHEL 7.9 [3.10.0-1160.99.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.99.1.el7.

Lustre-change: https://review.whamcloud.com/52359
Lustre-commit: TBD (from 8a59bb388266e3d2e1a5683ed1d9a1dc2fbf822a)

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Iafb955b1927102fef4995b92d64218e36a4a8d51
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-17109 kernel: new kernel [SLES15 SP5 5.14.21-150500.55.22.1]
Jian Yu [Wed, 20 Sep 2023 19:04:25 +0000 (12:04 -0700)]
LU-17109 kernel: new kernel [SLES15 SP5 5.14.21-150500.55.22.1]

This patch makes changes to support the new SLES15 SP5 release
with kernel 5.14.21-150500.55.22.1 for the Lustre client.

Lustre-change: https://review.whamcloud.com/52340
Lustre-commit: TBD (from c410e3c89eadd728559782f94102f283ef52d63a)

Test-Parameters: trivial clientdistro=sles15sp5 testlist=sanity
Test-Parameters: trivial clientdistro=sles15sp4 testlist=sanity

Change-Id: I278017a5c996a8cf4e3d604aa928e968ca007312
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoEX-8232 test: use client machines additionally to OSS
Alexandre Ioffe [Thu, 14 Sep 2023 01:07:14 +0000 (18:07 -0700)]
EX-8232 test: use client machines additionally to OSS

In addition to the OSS nodes, add replication agents to the client nodes.
This makes it possible to test lamigo replication on a large
number of nodes.

Test-Parameters: testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I980f95a4885991faf7d958e98fdbc7811fb1f163
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-15671 ost: remove REP_MBITS from OST_CONNECT2_SUPPORTED
Andreas Dilger [Tue, 19 Sep 2023 18:07:26 +0000 (12:07 -0600)]
LU-15671 ost: remove REP_MBITS from OST_CONNECT2_SUPPORTED

Remove the OBD_CONNECT2_REP_MBITS flag from the OST_CONNECT2_SUPPORTED
mask on the OST that was accidentally included in a backported patch.
If newer clients that have support for REP_MBITS (e.g. 2.15.x) try to
recover with the 2.14.0-ddn91+ OSS, they will loop endlessly since
they are not exchanging the right information in the replay RPC/reply.

Test-Parameters: trivial
Fixes: b85a12aa73 ("LU-15671 mds: do not send OST_CREATE transno interop")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6bef75900f8efdb8a1e35545a86c580a68f9ddc8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52417
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12
Lei Feng [Thu, 4 Nov 2021 11:41:06 +0000 (19:41 +0800)]
LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12

In some rare cases 12 quota ids are needed.
Usually (user, group) * (block, inode) * (inode, parent) = 8 qids
are needed. But with project id,
(user, group, project) * (block, inode) * (inode, parent) = 12 qids
are needed.
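
The counts above are the sizes of two Cartesian products. The sketch below
reproduces the arithmetic only; the tuple names and the count_qids() helper
are hypothetical and do not come from the quota code.

```python
from itertools import product

# Labels taken from the commit message; the variable names are illustrative.
ID_TYPES_NO_PROJECT = ("user", "group")
ID_TYPES_WITH_PROJECT = ("user", "group", "project")
RESOURCES = ("block", "inode")
OBJECTS = ("inode", "parent")


def count_qids(id_types):
    """Count every (id type, resource, object) combination."""
    return len(list(product(id_types, RESOURCES, OBJECTS)))
```

Calling count_qids() on the two id-type tuples gives 8 without project ids
and 12 with them, which is why the limit is raised to 12.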

Lustre-change: https://review.whamcloud.com/45456
Lustre-commit: I4b3ee197f6e274abda06edf60b246f089fe28d10

Signed-off-by: Lei Feng <flei@whamcloud.com>
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Test-Parameters: trivial testlist=sanity-quota
Change-Id: I26bcf97cbb79caee6f76dd076e1a03cd9ce3d9c5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-17015 obdclass: set cache entry/acquire expiry at init
Sebastien Buisson [Tue, 5 Sep 2023 09:08:16 +0000 (11:08 +0200)]
LU-17015 obdclass: set cache entry/acquire expiry at init

Give the ability to define values for cache entry expire and acquire
expire directly at upcall cache init.

Lustre-change: https://review.whamcloud.com/52271
Lustre-commit: TBD (from 2d24c820f32699d66b56024ae99a7b27944f6130)

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iee0dea66943ab6747d85a378861ae98c29faa11a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52370
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-17015 obdclass: make upcall cache hashtable size dynamic
Sebastien Buisson [Mon, 28 Aug 2023 09:37:51 +0000 (11:37 +0200)]
LU-17015 obdclass: make upcall cache hashtable size dynamic

The hash table used by the upcall cache mechanism should have an
adjustable size, depending on the purpose and context where it is
used.

Lustre-change: https://review.whamcloud.com/52128
Lustre-commit: 79f823bd40ee97a5846d828efce1080dc04a6057

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I53c5cb14f9a5630fc269d97cead9a5ca6a33895e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months agoLU-17041 kernel: update RHEL 8.8 [4.18.0-477.21.1.el8_8]
Jian Yu [Fri, 18 Aug 2023 21:51:44 +0000 (14:51 -0700)]
LU-17041 kernel: update RHEL 8.8 [4.18.0-477.21.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.21.1.el8_8.

Lustre-change: https://review.whamcloud.com/52003
Lustre-commit: TBD (from 4268396e4ee6e33a91b11ba4d0f77838aa3c172a)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: Ie24c8e438dd33afafb900664d9a4010160bc1a45
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52008
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-17111 kernel: update RHEL 9.2 [5.14.0-284.30.1.el9_2]
Jian Yu [Wed, 13 Sep 2023 20:01:51 +0000 (13:01 -0700)]
LU-17111 kernel: update RHEL 9.2 [5.14.0-284.30.1.el9_2]

Update RHEL 9.2 kernel to 5.14.0-284.30.1.el9_2 for Lustre client.

Lustre-change: https://review.whamcloud.com/52358
Lustre-commit: TBD (from f6f135c77911707b4c7282fedb3973a6a16e0d7d)

Test-Parameters: trivial clientdistro=el9.2 testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Id80dbba6b4434a83cf925d6961d727941274edf4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52365
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months agoLU-16424 tests: Add version check in sanity-lnet
Wei Liu [Mon, 14 Aug 2023 19:02:24 +0000 (12:02 -0700)]
LU-16424 tests: Add version check in sanity-lnet

Skip sanity-lnet test_205, test_207, test_209 and test_254 if the
version is older than 2.14.58, since the lnet_if_list
function was added by commit
3166a201e0 ("LU-15398 tests: Use remote peers for health tests")
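
A version gate of this kind can be sketched as follows. This is an
illustration in Python, not the shell code used by the test suite; the
parse_version() and should_skip() helpers are hypothetical, and the sketch
assumes plain dotted numeric version strings without suffixes.

```python
def parse_version(text: str) -> tuple:
    """Turn '2.14.58' into (2, 14, 58) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Threshold taken from the commit message.
MIN_VERSION = parse_version("2.14.58")


def should_skip(version: str) -> bool:
    """Return True when the given version is older than the threshold."""
    return parse_version(version) < MIN_VERSION
```

Comparing integer tuples rather than strings matters here: as strings,
"2.9.100" would sort after "2.14.58", while as tuples it is correctly older.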

Lustre-change: https://review.whamcloud.com/c/fs/lustre-release/+/51942
Lustre-commit: ee4f470d590dd19d9c7d188958d9305ccd666e5e

Test-Parameters: trivial testlist=sanity-lnet \
serverjob=lustre-b_es5_2 serverbuildno=591 \
serverdistro=el7.9

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: I9cd62d91980784e3b33cf4e30426bf74d17f717f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51942
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52379

21 months agoEX-7681 scripts: Compression estimate script
Patrick Farrell [Thu, 15 Jun 2023 18:49:56 +0000 (14:49 -0400)]
EX-7681 scripts: Compression estimate script

ll_compression_scan is a simple tool which can be run on any
Linux system to estimate the space usage reduction from the
Lustre Client Side Data Compression (CSDC) feature with
particular compression settings (algorithm, chunk size,
and compression level).

When run on one or more directories, it will recursively
examine a percentage of files under that directory, sampling
data in those files to estimate how the files will compress.

This tool samples data throughout the file, so it should
avoid problems with poor estimates for files with headers
which differ from the bulk data in the file.

However, if the directory tree is particularly imbalanced,
with a few large uncompressible files in one directory, and
many small files in other directories, then scanning a small
percentage of files may give a misleading compression estimate.
Sampling a larger percentage of files will improve this.

This tool requires the lz4, lzop, and gzip utilities to
be installed in order to test those compression types.
(lzop is the command line utility for lzo compression)
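
The sampling idea described above can be sketched as follows. This is not
the ll_compression_scan implementation: it is a simplified illustration that
uses Python's zlib module instead of the lz4, lzop, or gzip utilities, and
the estimate_ratio() name and its default parameters are assumptions.

```python
import os
import zlib


def estimate_ratio(path, chunk_size=65536, samples=8, level=6):
    """Estimate compressed/original size ratio from evenly spaced chunks."""
    size = os.path.getsize(path)
    if size == 0:
        return 1.0
    # Never take more samples than there are chunks in the file.
    n_chunks = max(1, -(-size // chunk_size))  # ceiling division
    samples = min(samples, n_chunks)
    raw = packed = 0
    with open(path, "rb") as f:
        for i in range(samples):
            # Spread the sampled chunks across the whole file so a header
            # that differs from the bulk data does not dominate the result.
            chunk_index = (i * n_chunks) // samples
            f.seek(chunk_index * chunk_size)
            data = f.read(chunk_size)
            if not data:
                continue
            raw += len(data)
            packed += len(zlib.compress(data, level))
    return packed / raw if raw else 1.0
```

A ratio close to 0 suggests highly compressible data, while a ratio near 1
suggests data that will not benefit from compression.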

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I092f9608553eba10bacfcc3c4a3fafc9a454c287
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51333
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoEX-7433 osc: disable CPU-access features for RDMA only pages
Patrick Farrell [Wed, 13 Sep 2023 15:36:58 +0000 (11:36 -0400)]
EX-7433 osc: disable CPU-access features for RDMA only pages

Pages which cannot be accessed by the CPU are referred to
as RDMA only pages.  If pages cannot be accessed by the
CPU, it is impossible for us to do compression,
encryption, checksums, or short-io (data-in-RPC) on them.

This patch disables compression and encryption for these
pages and cleans up the code so checksums and short-io
are disabled by the same code.

The only user of RDMA only pages today is Nvidia's GPU
direct, so this patch disables compression and
encryption with GPU direct.

NB: We eventually intend to handle compression for
GPU direct with server side compress/decompress.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iad9311617cddf27d3ff75a17429499c573067ea0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51770
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-16750 ldiskfs: optimize metadata allocation for hybrid LUNs
Bobi Jam [Mon, 11 Sep 2023 12:06:02 +0000 (20:06 +0800)]
LU-16750 ldiskfs: optimize metadata allocation for hybrid LUNs

With LVM it is possible to create an LV with SSD storage at the
beginning of the LV and HDD storage at the end of the LV, and use that
to separate ext4 metadata allocations (that need small random IOs)
from data allocations (that are better suited for large sequential
IOs) depending on the type of underlying storage.  Between 0.5-1.0% of
the filesystem capacity would need to be high-IOPS storage in order to
hold all of the internal metadata.
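
As a rough sizing aid, the 0.5-1.0% guideline above can be turned into a
small calculation. The iops_capacity_range() helper is hypothetical and only
restates the percentages from this message in integer arithmetic.

```python
def iops_capacity_range(total_bytes: int) -> tuple:
    """Return (low, high) high-IOPS capacity in bytes for 0.5% and 1.0%."""
    low = total_bytes * 5 // 1000    # 0.5% of the filesystem capacity
    high = total_bytes * 10 // 1000  # 1.0% of the filesystem capacity
    return low, high
```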

This would improve performance for inode and other metadata access,
such as ls, find, e2fsck, and in general improve file access latency,
modification, truncate, unlink, transaction commit, etc.

This patch splits the largest free order group lists and the average
fragment size lists into two additional lists for IOPS/fast storage groups,
and performs cr 0 / cr 1 group scanning for metadata block allocation in
the following order:

if (allocate metadata blocks)
      if (cr == 0)
              try to find group in largest free order IOPS group list
      if (cr == 1)
              try to find group in fragment size IOPS group list
      if (above two find failed)
              fall through normal group lists as before
if (allocate data blocks)
      try to find group in normal group lists as before
      if (failed to find group in normal group && mb_enable_iops_data)
              try to find group in IOPS groups

Non-metadata block allocation does not allocate from the IOPS groups
if non-IOPS groups are not used up.
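
The search order in the pseudocode above can be modelled as follows. This is
a simplified sketch, not the mballoc code: group lists are plain Python
lists, a failed search is modelled as an empty list, and the list names and
both function names are hypothetical.

```python
def candidate_lists(is_metadata, cr, enable_iops_data=False):
    """Return the names of the group lists to search, in order."""
    if is_metadata:
        order = []
        if cr == 0:
            order.append("iops_largest_free_order")
        elif cr == 1:
            order.append("iops_avg_fragment_size")
        # Fall through to the normal lists if the IOPS lists yield nothing.
        order.append("normal")
        return order
    # Data blocks use the normal lists first and touch IOPS groups only
    # when that is explicitly enabled (mb_enable_iops_data in the message).
    order = ["normal"]
    if enable_iops_data:
        order.append("iops")
    return order


def choose_group(lists, is_metadata, cr, enable_iops_data=False):
    """Pick the first group from the first non-empty candidate list."""
    for name in candidate_lists(is_metadata, cr, enable_iops_data):
        groups = lists.get(name, [])
        if groups:
            return groups[0]
    return None
```

With this model, a data allocation returns no group when the normal lists
are empty and IOPS data allocation is disabled, which mirrors the rule that
data does not spill into IOPS groups by default.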

Add an option to mke2fs to mark which blocks are in the IOPS region
of storage at format time:

  -E iops=0-1024G,4096-8192G

so the ext4 mballoc code can then use the EXT4_BG_IOPS flag in the
group descriptors to decide which groups to allocate dynamic
filesystem metadata from.

--
v2->v3: add sysfs mb_enable_iops_data to enable data block allocation
        from IOPS groups.
v1->v2: for metadata block allocation, search in IOPS list then normal
        list.

Lustre-change: https://review.whamcloud.com/51625
Lustre-commit: TBD (from 452f102a581f2a8ef8396bf0ba5584d61512a267)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ice2d25b8db19f67e70690f9ccebc419f253b12bd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-14438 ldiskfs: backport ldiskfs mballoc patches
Bobi Jam [Fri, 21 Jul 2023 07:34:20 +0000 (15:34 +0800)]
LU-14438 ldiskfs: backport ldiskfs mballoc patches

This contains the following kernel patches:

a078dff87013 ("ext4: fixup possible uninitialized variable access in
                     ext4_mb_choose_next_group_cr1()")
80fa46d6b9e7 ("ext4: limit the number of retries after discarding
                     preallocations blocks")
820897258ad3 ("ext4: Refactor code related to freeing PAs")
cf5e2ca6c990 ("ext4: mballoc: refactor
                     ext4_mb_discard_preallocations()")
83e80a6e3543 ("ext4: use buckets for cr 1 block scan instead of
                     rbtree")
a9f2a2931d0e ("ext4: use locality group preallocation for small
                     closed files")
1940265ede66 ("ext4: avoid unnecessary spreading of allocations among
                     groups")
4fca50d440cc ("ext4: make mballoc try target group first even with
                     mb_optimize_scan")
3fa5d23e68a3 ("ext4: reflect mb_optimize_scan value in options file")
077d0c2c78df ("ext4: make mb_optimize_scan performance mount option
                     work with extents")
196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
21175ca434c5 ("ext4: make prefetch_block_bitmaps default")
3d392b2676bf ("ext4: add prefetch_block_bitmaps mount option")
cfd732377221 ("ext4: add prefetching for block allocation bitmaps")
4b68f6df1059 ("ext4: add MB_NUM_ORDERS macro")
dddcd2f9ebde ("ext4: optimize the implementation of
                     ext4_mb_good_group()")
a6c75eaf1103 ("ext4: add mballoc stats proc file")
67d251860461 ("ext4: drop s_mb_bal_lock and convert protected fields
                     to atomic")

Lustre-change: https://review.whamcloud.com/51472
Lustre-commit: TBD (from 8da59fc988f0cebcac10e8ef1faab1e4c913de03)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I079dfb74bd743894934484803cedb683073e4d94
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52120
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoRM-620 build: New tag 2.14.0-ddn102
Andreas Dilger [Thu, 14 Sep 2023 07:41:01 +0000 (01:41 -0600)]
RM-620 build: New tag 2.14.0-ddn102

New tag 2.14.0-ddn102

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I47ca9dbbb4c952facefa4de57331aab791d72ed2

21 months agoRM-620 build: New tag lipe-2.31
Andreas Dilger [Thu, 14 Sep 2023 07:40:38 +0000 (01:40 -0600)]
RM-620 build: New tag lipe-2.31

New tag lipe-2.31

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8fe1d357b47950452f751d7e833f7e5a84c0a867

21 months agoEX-7290 lipe: lipe_find3 get attr warnings
Alexandre Ioffe [Wed, 6 Sep 2023 08:17:59 +0000 (01:17 -0700)]
EX-7290 lipe: lipe_find3 get attr warnings

Report each get attr error when the command line
option --warnings=get-attr is given.
Count all get attr errors per attr type and report them at the end.
Exclude the 'trusted.link' attribute when scanning an OST.
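
The reporting behaviour described above (optional per-error warnings plus a
per-attribute summary at the end) can be sketched like this. It is an
illustration only, not lipe_find3 code; the function name, the message
format, and the attribute names used in the usage note are assumptions.

```python
from collections import Counter


def summarize_attr_errors(errors, warn_each=False):
    """Build report lines from an iterable of (attr_name, message) pairs."""
    counts = Counter()
    lines = []
    for attr, message in errors:
        counts[attr] += 1
        if warn_each:
            # Emitted only when per-error warnings are requested.
            lines.append(f"warning: cannot get {attr}: {message}")
    # The per-attribute totals are always reported at the end.
    for attr, n in sorted(counts.items()):
        lines.append(f"{attr}: {n} error(s)")
    return lines
```

For instance, three failures spread over two attribute names produce two
summary lines, plus three warning lines when warn_each is set.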

Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3,sanityn
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I5e9b226bd4046eddcf779ca06af0892589d447ac
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52292
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-10729 tests: replay-dual/23d to wait
Alex Zhuravlev [Fri, 19 Nov 2021 19:52:28 +0000 (22:52 +0300)]
LU-10729 tests: replay-dual/23d to wait

replay-dual/23d simulates a dropped reply for the executed
update, but previous tests can break this:
 - the update modifies remote llog
 - there can be another update to that remote log
   (from the previous tests)
 - fail_loc (OBD_FAIL_UPDATE_OBJ_NET) is applied to the
   old update
 - the 23d's update gets stuck

so the test has to ensure there are no pending/in-flight
updates.

Lustre-change: https://review.whamcloud.com/45623
Lustre-commit: 63a19f6f666b9d18fede66ce8bcd2d799b5e0fa7

Test-Parameters: trivial testlist=replay-dual mdscount=2 mdtcount=4
Test-Parameters: testlist=replay-dual mdscount=2 mdtcount=4
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3b60468d1f6f467006d5872ec62b81f57fa0423e
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months agoLU-17108 nodemap: make map_mode available for default nm
Sebastien Buisson [Mon, 11 Sep 2023 15:31:09 +0000 (17:31 +0200)]
LU-17108 nodemap: make map_mode available for default nm

The map_mode property controls the way mapping is carried out. It
is already available on regular nodemaps, to decide whether uids, gids
and/or projids will be mapped.
On the default nodemap, where it is not possible to define mappings,
the map_mode property will be taken into account when trusted is 0 and
deny_unknown is 0. Unmapped IDs will be left unchanged.

Lustre-change: https://review.whamcloud.com/52336
Lustre-commit: TBD (from 613ca001049887b1dc0cb2501f566c263ff7a006)

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I16a2f5cfda11a8435b56a00f3e97bdc70741c156
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-16534 build: Prefer timer_delete[_sync]
Shaun Tancheff [Wed, 13 Sep 2023 05:12:35 +0000 (22:12 -0700)]
LU-16534 build: Prefer timer_delete[_sync]

Linux commit v6.1-rc1-7-g9a5a30568697
  timers: Get rid of del_singleshot_timer_sync()
Linux commit v6.1-rc1-11-g9b13df3fb64e
  timers: Rename del_timer_sync() to timer_delete_sync()
Linux commit v6.1-rc1-12-gbb663f0f3c39
  timers: Rename del_timer() to timer_delete()

Prefer timer_delete_sync() to del_singleshot_timer_sync()
Prefer timer_delete_sync() to del_timer_sync()
Prefer timer_delete() to del_timer()

Provide del_timer and del_timer_sync when
timer_delete[_sync] is not available

Lustre-change: https://review.whamcloud.com/49922
Lustre-commit: 0ec89529ce14a1bb5af0c01ed86424a10e0e373c

Test-Parameters: trivial
HPE-bug-id: LUS-11470
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4c946c315a83482dd0bd69e5e89f0302a67bf81c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52357
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>