Whamcloud - gitweb
fs/lustre-release.git
15 months ago EX-8130 lipe: Add output for dir sizes stats
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:49:39 +0000 (15:49 +0100)]
EX-8130 lipe: Add output for dir sizes stats

This patch adds functions for displaying size statistics
for directories in the general report.
This patch adds support for *.out format only.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Iaf70aa4d84295f1a1a297b00fa45f12fb98c7625
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53983
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Add entry point for dirs stats
Vitaliy Kuznetsov [Mon, 19 Feb 2024 15:44:35 +0000 (16:44 +0100)]
EX-8130 lipe: Add entry point for dirs stats

This patch adds a function that is an entry point for
collecting statistics about directory sizes.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ide0c6006e287f69a1de99a5578ceab0070ea383e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Add key func for work with tree
Vitaliy Kuznetsov [Mon, 19 Feb 2024 15:37:42 +0000 (16:37 +0100)]
EX-8130 lipe: Add key func for work with tree

This patch adds two key functions to collect directory size
statistics, which contain the basic logic for adding
directories to memory and incrementing size counters.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I06e03a6be1052b7178274835169cc41d044ca1ab
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Add helper functions for stats
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:32:25 +0000 (15:32 +0100)]
EX-8130 lipe: Add helper functions for stats

This patch adds several helper functions for working with
directory size statistics. Also add ls3_stats_rm_first_dir(),
which removes the directory from the list to increase counters.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I04807846b49d6fb0e476b8bf146ba337f80e3d5e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Add func for working with paths
Vitaliy Kuznetsov [Mon, 19 Feb 2024 19:02:44 +0000 (20:02 +0100)]
EX-8130 lipe: Add func for working with paths

This patch adds directory path processing helper functions
that will be used later to collect directory size statistics.

These functions set the stage for working on increasing the
size counters for each directory, along the entire chain in
the file or directory path.
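
As a rough illustration of what incrementing counters "along the entire
chain" of a path can look like, the sketch below credits a file's size to
every ancestor directory of an absolute path. All names here (dir_table,
dir_get, charge_path) and the fixed-size table are hypothetical and are not
taken from the lipe sources; paths are assumed to be normalized (no "//" or
trailing "/").

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DIRS     64    /* hypothetical table capacity */
#define MAX_PATH_LEN 256   /* hypothetical path length limit */

/* Hypothetical per-directory counter; not the lipe_scan3 structure. */
struct dir_counter {
	char path[MAX_PATH_LEN];
	uint64_t size;
};

struct dir_table {
	struct dir_counter dirs[MAX_DIRS];
	int count;
};

/* Look up a directory by the first `len` bytes of `path`; NULL if absent. */
static struct dir_counter *dir_find(struct dir_table *t, const char *path,
				    size_t len)
{
	for (int i = 0; i < t->count; i++)
		if (strlen(t->dirs[i].path) == len &&
		    memcmp(t->dirs[i].path, path, len) == 0)
			return &t->dirs[i];
	return NULL;
}

/* Like dir_find(), but creates the entry when it is missing. */
static struct dir_counter *dir_get(struct dir_table *t, const char *path,
				   size_t len)
{
	struct dir_counter *d = dir_find(t, path, len);

	if (d)
		return d;
	if (t->count == MAX_DIRS || len >= MAX_PATH_LEN)
		return NULL;   /* table full or path too long: skip */
	d = &t->dirs[t->count++];
	memcpy(d->path, path, len);
	d->path[len] = '\0';
	d->size = 0;
	return d;
}

/* Add file_size to "/" and to every directory prefix of file_path. */
static void charge_path(struct dir_table *t, const char *file_path,
			uint64_t file_size)
{
	struct dir_counter *d;

	assert(file_path[0] == '/');
	d = dir_get(t, "/", 1);
	if (d)
		d->size += file_size;
	for (size_t i = 1; file_path[i] != '\0'; i++) {
		if (file_path[i] != '/')
			continue;
		/* prefix up to, but not including, this '/' */
		d = dir_get(t, file_path, i);
		if (d)
			d->size += file_size;
	}
}

/* Return the accumulated size of a directory, or 0 if it was never seen. */
static uint64_t dir_size(struct dir_table *t, const char *path)
{
	struct dir_counter *d = dir_find(t, path, strlen(path));

	return d ? d->size : 0;
}
```

With a zero-initialized table, charging "/a/b/file1" with 100 bytes adds 100
to "/", "/a" and "/a/b". A linear table keeps the sketch short; a real
implementation would more likely use a tree or hash keyed by directory.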

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I6e7302e9771dce2933c6730a1117fec3bc2b0fda
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Directory scan size stats
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:52:10 +0000 (15:52 +0100)]
EX-8130 lipe: Directory scan size stats

This patch adds functionality for creating new
directories in memory and for expanding the memory
used for new child directories. It also adds a
function to initialize the starting directory.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I3ff6a62ffd9d6535ed4434f517d1c93d6ae01b34
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Add functions to create a rating
Vitaliy Kuznetsov [Mon, 19 Feb 2024 20:25:29 +0000 (21:25 +0100)]
EX-8130 lipe: Add functions to create a rating

This patch adds new functionality to directory statistics
for working with a ranking table for the largest
directories (like TOP 100).

The structure for storing the rating is created after the
lipe_scan3 scan is completed. The number of objects in the
structure is determined before lipe_scan3 is launched, either
by the default value or by the user in lipe_find3 via the
-top-rating option, and is not expanded while lipe_scan3 is
running. A new object is added to the heap by replacing the
object with the smallest size in the heap if the new object's
size is larger. Objects are added to the heap when the directory
size results are printed, since only then are the final sizes of
the directories known.
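
The replacement rule described above is the usual fixed-capacity min-heap
approach to a "top N" list. The sketch below shows that technique under
stated assumptions: the names (struct rating, rating_offer) are hypothetical,
only sizes are stored, and nothing here is taken from the lipe sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-capacity rating; capacity is chosen before the scan. */
struct rating {
	uint64_t *sizes;   /* min-heap: sizes[0] is the smallest kept entry */
	size_t capacity;
	size_t count;
};

static void swap_u64(uint64_t *a, uint64_t *b)
{
	uint64_t tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Restore the heap property upwards from index i. */
static void sift_up(struct rating *r, size_t i)
{
	while (i > 0) {
		size_t parent = (i - 1) / 2;

		if (r->sizes[parent] <= r->sizes[i])
			break;
		swap_u64(&r->sizes[parent], &r->sizes[i]);
		i = parent;
	}
}

/* Restore the heap property downwards from index i. */
static void sift_down(struct rating *r, size_t i)
{
	for (;;) {
		size_t left = 2 * i + 1, right = left + 1, min = i;

		if (left < r->count && r->sizes[left] < r->sizes[min])
			min = left;
		if (right < r->count && r->sizes[right] < r->sizes[min])
			min = right;
		if (min == i)
			break;
		swap_u64(&r->sizes[min], &r->sizes[i]);
		i = min;
	}
}

/* Offer one final directory size; returns 1 if kept, 0 if rejected. */
static int rating_offer(struct rating *r, uint64_t size)
{
	if (r->capacity == 0)
		return 0;
	if (r->count < r->capacity) {
		/* Still filling up: append and fix the heap. */
		r->sizes[r->count] = size;
		sift_up(r, r->count);
		r->count++;
		return 1;
	}
	if (size <= r->sizes[0])
		return 0;   /* not larger than the smallest kept entry */
	/* Replace the smallest entry with the new, larger one. */
	r->sizes[0] = size;
	sift_down(r, 0);
	return 1;
}
```

Because the smallest kept entry always sits at the root, each offer costs at
most O(log N) and the structure never grows past the capacity fixed up front.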

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ie4a449fe69022716232638e0f856a10850403831
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago EX-8130 lipe: Add directory scan structs
Vitaliy Kuznetsov [Tue, 5 Mar 2024 14:21:21 +0000 (15:21 +0100)]
EX-8130 lipe: Add directory scan structs

This patch adds new structures to lipe_scan3 for collecting
and storing directory statistics, as well as initialization
and destroy functions. This patch is the first in a series
of patches that add functionality for collecting directory
statistics in lipe_find3.

Test-Parameters: trivial
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ib14ce13677d93d1a53299501138e78c7b290793c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago RM-620 build: New tag 2.14.0-ddn137
Andreas Dilger [Sun, 3 Mar 2024 10:31:20 +0000 (03:31 -0700)]
RM-620 build: New tag 2.14.0-ddn137

New tag 2.14.0-ddn137

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icd060c758aa849abcefb7517f0d120a679a5a1b5

15 months ago LU-17500 qmt: avoid "enforced bit set, but neither"
Sergey Cheremencev [Fri, 2 Feb 2024 20:07:00 +0000 (23:07 +0300)]
LU-17500 qmt: avoid "enforced bit set, but neither"

Don't call qmt_revalidate_qunit in qmt_set_with_lqe,
as it is possible that the lqe_enforced bit is not cleared
when the hard and soft limits are being set to 0.
There is no reason to recalculate qunit and edquot when we
set limits to 0. For the case when limits are changed,
qunit and edquot will be calculated below in the "dirtied"
branch, so there is no reason to do this twice.

This patch helps to avoid the following error:
LustreError: 21362:0:(qmt_entry.c:746:qmt_adjust_qunit())
  $$$ enforced bit set, but neither hard nor soft limit are set

Lustre-change: https://review.whamcloud.com/53893
Lustre-commit: 7498e7c38dffe23752b03bf168f3b5419855b10b

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8f5d9630f43b66ae7ea2be0bf2c735a02e1f6299
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54185
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17481 mdt: count all opens in mdt.*.md_stats
Yang Sheng [Thu, 1 Feb 2024 16:31:13 +0000 (00:31 +0800)]
LU-17481 mdt: count all opens in mdt.*.md_stats

Count all opens for the MDT. Also add a test case to
verify it.

Lustre-change: https://review.whamcloud.com/53880
Lustre-commit: 055f939979b20eb769803ecffd0caa53c440ad7d

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I2fa90cc2b4ce8d7d039736a5f40a70cbeb04bf8c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-8130 obd: remove limit on client mounts
Andreas Dilger [Mon, 26 Feb 2024 20:15:21 +0000 (13:15 -0700)]
LU-8130 obd: remove limit on client mounts

Using the in-kernel rhashtable instead of cfs_hash_table
for obd->obd_uuid_hash has the side effect of limiting the
number of elements in the hash table, and thereby limits the
maximum number of Lustre clients to 16384.

The patch raises the limit to 2^31 (rhashtable default).

Fixes: e40b008e88 ("LU-8130 obd: convert obd uuid hash to rhashtable")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I222a6d0d2789ea9d1bb3530b3619d08ec83ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
15 months ago LU-17454 nodemap: allow mapping for root
Sebastien Buisson [Wed, 31 Jan 2024 14:40:44 +0000 (15:40 +0100)]
LU-17454 nodemap: allow mapping for root

Allow an id mapping for root, to match what is implemented for regular
users, with the following behavior:
- if admin property is set, root remains root.
- if admin property is not set, the idmap for '0' is taken into
  account.
- if admin property is not set and there is no idmap for '0' and
  deny_unknown property is not set, root is squashed to the squash
  uid/gid.
- if admin property is not set and there is no idmap for '0' and
  deny_unknown property is set, root is blocked.

Note that map_mode remains ignored for root. Also, capabilities are
not dropped for root when mapped, just like it is done for regular
users. If admins want to drop root capabilities, root must be
squashed.
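
The four cases listed above reduce to a short decision function. The sketch
below is only an illustration of that ordering: struct nodemap_sketch, its
fields and map_root() are hypothetical names, not the Lustre nodemap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the nodemap properties named above. */
struct nodemap_sketch {
	bool admin;            /* "admin" property */
	bool deny_unknown;     /* "deny_unknown" property */
	bool has_root_idmap;   /* an idmap entry exists for client id 0 */
	uint32_t root_idmap;   /* filesystem id that client id 0 maps to */
	uint32_t squash_id;    /* squash uid/gid */
};

enum root_result { ROOT_KEPT, ROOT_MAPPED, ROOT_SQUASHED, ROOT_BLOCKED };

/*
 * Decide what happens to client root (id 0). On ROOT_BLOCKED, *out_id is
 * left untouched; otherwise it receives the resulting id.
 */
static enum root_result map_root(const struct nodemap_sketch *nm,
				 uint32_t *out_id)
{
	if (nm->admin) {
		*out_id = 0;               /* root remains root */
		return ROOT_KEPT;
	}
	if (nm->has_root_idmap) {
		*out_id = nm->root_idmap;  /* idmap for '0' is honored */
		return ROOT_MAPPED;
	}
	if (!nm->deny_unknown) {
		*out_id = nm->squash_id;   /* fall back to the squash id */
		return ROOT_SQUASHED;
	}
	return ROOT_BLOCKED;
}
```

Note that the admin check comes first, so an idmap for '0' only matters when
the admin property is not set.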

sanity-sec test_15 is updated to test root mapping.

Lustre-change: https://review.whamcloud.com/53870
Lustre-commit: b4a336d0ce91c05ae48544b3fd2e56f0bcb0a8cf

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id2e950b99e3b3ba27179408c647e1f7b7c49e32e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54159
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17465 nodemap: change squash default value to 65534
Sebastien Buisson [Tue, 23 Jan 2024 09:07:25 +0000 (10:07 +0100)]
LU-17465 nodemap: change squash default value to 65534

Initially, default values for nodemap.squash_uid/gid/projid were set
to 99, to match user 'nobody'. But on newer systems, nobody has
changed to 65534 and 99 no longer exists.
It is safe to use 65534 in all cases, as even on older systems it
exists and corresponds to 'nfsnobody'.

Lustre-change: https://review.whamcloud.com/53802
Lustre-commit: d4927da410525db5f0524d618da47a17fe9c7835

Test-Parameters: testlist=sanity env=ONLY=432 serverversion=EXA5
Test-Parameters: testlist=sanity env=ONLY=432 clientversion=EXA5
Test-Parameters: testlist=sanity-quota env=ONLY=75 serverversion=EXA5
Test-Parameters: testlist=sanity-quota env=ONLY=75 clientversion=EXA5
Test-Parameters: testlist=sanity-selinux env=ONLY=21 serverversion=EXA5
Test-Parameters: testlist=sanity-selinux env=ONLY=21 clientversion=EXA5
Test-Parameters: testlist=sanity-sec env=ONLY="7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 32 33 34 35 36 55 61 64" serverversion=EXA5
Test-Parameters: testlist=sanity-sec env=ONLY="7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 32 33 34 35 36 55 61 64" clientversion=EXA5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2e20fda0fdc0d5bfdf964a890bfbd0b54b943cf4
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17357 mgc: wait for sptlrpc config log
Sebastien Buisson [Tue, 12 Dec 2023 16:49:49 +0000 (17:49 +0100)]
LU-17357 mgc: wait for sptlrpc config log

The sptlrpc config log is mandatory to establish connections to
targets with proper security context. So wait for its retrieval.

Add sanity-sec test_68 to exercise this, and improve test_32
for mgssec.

Lustre-change: https://review.whamcloud.com/53423
Lustre-commit: 4a3e428361a03b4bc777eddd466ba1ff8b72b51e

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5352e926dc6a9a68db1224629c68a42b74bee8a4
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54160
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-17563 kernel: update SLES15 SP5 [5.14.21-150500.55.49.1]
Jian Yu [Fri, 1 Mar 2024 23:52:23 +0000 (15:52 -0800)]
LU-17563 kernel: update SLES15 SP5 [5.14.21-150500.55.49.1]

Update SLES15 SP5 kernel to 5.14.21-150500.55.49.1 for Lustre client.

Lustre-change: https://review.whamcloud.com/54240
Lustre-commit: TBD (from fb361e2001c2e7fd34faea82236d427861e16ade)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp5 testlist=sanity

Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3

Change-Id: I23868ff25ae093a52f004e556789805a644832ac
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54244
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17593 kernel: update RHEL 8.9 [4.18.0-513.18.1.el8_9]
Jian Yu [Fri, 1 Mar 2024 23:47:13 +0000 (15:47 -0800)]
LU-17593 kernel: update RHEL 8.9 [4.18.0-513.18.1.el8_9]

Update RHEL 8.9 kernel to 4.18.0-513.18.1.el8_9 for Lustre client.

Lustre-change: https://review.whamcloud.com/54238
Lustre-commit: TBD (from eee579bfc2f1e1a8c02e76c3a82701920b0703ff)

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=el8.9 testlist=sanity

Test-Parameters: optional clientdistro=el8.9 testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.9 testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.9 testgroup=full-part-3

Change-Id: I2c928e4c08af278dacce1d1dc7a14fa77ffffa33
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54243
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17561 kernel: update RHEL 9.3 [5.14.0-362.18.1.el9_3]
Jian Yu [Fri, 1 Mar 2024 23:37:58 +0000 (15:37 -0800)]
LU-17561 kernel: update RHEL 9.3 [5.14.0-362.18.1.el9_3]

Update RHEL 9.3 kernel to 5.14.0-362.18.1.el9_3 for Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=el9.3 testlist=sanity

Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3

Lustre-change: https://review.whamcloud.com/54236
Lustre-commit: TBD (from 2bbdc9e49055de2eda43a6d4b745543f8e354740)

Change-Id: Iddfe57197d854e0be864c0ce64699f92fcc181d1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54242
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-13805 osd: Implement unaligned DIO connect flag
Andreas Dilger [Fri, 1 Mar 2024 22:18:39 +0000 (15:18 -0700)]
LU-13805 osd: Implement unaligned DIO connect flag

Unupgraded ZFS servers may crash if they receive unaligned
DIO, so we need a compat flag and a test to recognize those
servers.

This patch extracts server-side logic from two master patches
to improve interop testing, but does not implement client UDIO.

Lustre-change: https://review.whamcloud.com/51126
Lustre-commit: 0e6e60b1233b08952c338b2c4f121ef749a99f8b
Was-Change-Id: I5d6ee3fa5dca989c671417f35a981767ee55d6e2

Lustre-change: https://review.whamcloud.com/45616
Lustre-commit: 7194eb6431d2ef7245ef3b13394b60e220145187
Was-Change-Id: I7eeebf9a608f006c8095b95f0677adb99f19d640

Test-Parameters: trivial testlist=sanity env=ONLY=56 fstype=zfs
Test-Parameters: testlist=sanity env=ONLY=56 clientbuildno=4505 clientjob=lustre-master clientdistro=el8.8
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8b987c00f741a884ba28c18309cc2f90baf4809a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-13805 obd: Reserve unaligned DIO connect flag
Patrick Farrell [Wed, 9 Aug 2023 16:16:25 +0000 (12:16 -0400)]
LU-13805 obd: Reserve unaligned DIO connect flag

Unaligned DIO generally requires only client changes, but
an assert must be removed from ZFS servers for it to work
correctly.  This means we need a connect flag to recognize
whether or not a server running ZFS can safely use
unaligned DIO.

All OSTs will present this flag - to keep things simple -
but if the flag is not present, we'll still do unaligned
DIO to ldiskfs OSTs.

The actual implementation will be in another patch; this one
just creates the flag itself.

Lustre-change: https://review.whamcloud.com/51075
Lustre-commit: 4c96cbf89dba5e4bf8ddf98a18b72142c22a4289

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8b149cc54f4fb11e64182c65f2fbb01f8a3d3868
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53708
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months ago LU-16518 ptlrpc: fix clang build errors
Timothy Day [Wed, 21 Feb 2024 21:14:17 +0000 (13:14 -0800)]
LU-16518 ptlrpc: fix clang build errors

Fixed bugs which cause errors on Clang.

The majority of changes involve adding
defines for the 'ptlrpc_nrs_ctl' enum.
This avoids having to explicitly cast
enums from one type to another.

An unused variable 'req' was removed from
'nrs_tbf_req_get'. A 'strlcpy' in
'sptlrpc_process_config' was copying the
wrong number of bytes. Another variable,
'rc' in 'sptlrpc_lproc_init', seemed to
be neglected unintentionally; this was also
fixed.

Lustre-change: https://review.whamcloud.com/49859
Lustre-commit: 50f28f81b5aa8f8ad1c8585bd7e262910f936e50

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If994c625199b392198f944f9cd21bbf2142bce69
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-16962 build: parallel configure cleanup
Shaun Tancheff [Wed, 21 Feb 2024 20:24:53 +0000 (12:24 -0800)]
LU-16962 build: parallel configure cleanup

LC_REGISTER_SHRINKER_FORMAT_NAMED macro should use
  register_shrinker_format

Lustre-change: https://review.whamcloud.com/51670
Lustre-commit: 1e9d48625b9a99d651a2e96cf947b60723713304

LU-16962 build: parallel header checks

Add LB2_CHECK_LINUX_HEADER_SRC and LB2_CHECK_LINUX_HEADER_RESULT
macros to use for running header checks in parallel.

Migrate (most) header checks to parallel and run a subset
early as the results of those tests are required by other
configure tests.

Lustre-change: https://review.whamcloud.com/51673
Lustre-commit: 2e025641ef087f159ca000ff3c4acb3ce886b8a3

Test-Parameters: trivial
HPE-bug-id: LUS-11709
HPE-bug-id: LUS-11710
Fixes: 0006eb3644 ("LU-16328 llite: migrate_folio, vfs_setxattr")
Fixes: ca992899d5 ("LU-16351 llite: Linux 6.1 prandom, folios_contig, vma_iterator")
Fixes: 7fe7f4ca06 ("LU-16520 build: Move strscpy to libcfs common header")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0cb630d035a23edfa353040f4c0d25c46eb417d8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-16957 build: Improve parallel --config-cache
Shaun Tancheff [Wed, 21 Feb 2024 20:22:16 +0000 (12:22 -0800)]
LU-16957 build: Improve parallel --config-cache

The parallel build should consider the configure cache before
adding tests to the parallel build pass.

Track the number of compile tests needed, skip the make when
no build tests are needed.

Also unify libcfs, core, and ldiskfs build passes to a single step.

Configure timings vs master

     master       master w/cache  |     patch         patch w/cache
 --------------   --------------- | ---------------  ----------------
 real  1m3.493s   real  0m34.024s | real  1m3.903s    real  0m8.404s
 user 1m34.587s   user  1m16.547s | user  1m37.191s   user  0m4.292s
 sys  0m35.119s   sys   0m22.687s | sys   0m35.297s   sys   0m5.514s

Lustre-change: https://review.whamcloud.com/51637
Lustre-commit: 0dfeed23d67fe5b3f283ec5b9671c94f0fe2303f

Test-Parameters: trivial
HPE-bug-id: LUS-11706
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6696b350e8315190a67c1463435b18a87d45813e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-16793 build: Enable compile tests to require <module>.ko
Shaun Tancheff [Wed, 21 Feb 2024 20:18:06 +0000 (12:18 -0800)]
LU-16793 build: Enable compile tests to require <module>.ko

Currently the build tests only require that a kernel API test
produce an object file (.o).

Cases that have a missing symbol export, directly or
indirectly, will generate an object file and fail to
generate a kernel module (.ko).

Enable tests to select the stricter criteria.

Lustre-change: https://review.whamcloud.com/50849
Lustre-commit: 581db5e89e0d690961e49278a7b50ecce78e5a22

Test-Parameters: trivial
Fixes: cc5594df3e ("LU-16759 o2ib: MOFED 5.5+ ib_dma_virt_map_sg")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iae481f1287023ea6c2432d147c497fa0a55fd689
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17081 build: compatibility for 6.5 kernels
Shaun Tancheff [Fri, 16 Feb 2024 07:17:26 +0000 (23:17 -0800)]
LU-17081 build: compatibility for 6.5 kernels

Linux commit v6.4-rc2-29-gc6585011bc1d
  splice: Remove generic_file_splice_read()

Prefer filemap_splice_read and provide alternates for older kernels.

Linux commit v6.4-rc2-30-g3fc40265ae2b
  iov_iter: Kill ITER_PIPE

ITER_PIPE and iov_iter_is_pipe() are removed, provide a replacement
for iov_iter_is_pipe

Linux commit v6.4-rc4-53-g54d020692b34
  mm/gup: remove unused vmas parameter from get_user_pages()

Use vma_lookup() to acquire the vma following get_user_pages()

Linux commit v6.4-rc7-1884-gdc97391e6610
  sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)

Use sendmsg when MSG_SPLICE_PAGES is defined. Provide a wrapper
using sendpage() for older kernels.

Lustre-change: https://review.whamcloud.com/52258
Lustre-commit: 2bb54b6383d57ac61092593b9e6d9c80801263f5

HPE-bug-id: LUS-11811
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I95a0954a602c8db08d30b38a50dcd50107c8f268
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54055
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-17161 build: Avoid fortify_memset in OBD_FREE_PTR
Shaun Tancheff [Thu, 15 Feb 2024 01:26:00 +0000 (17:26 -0800)]
LU-17161 build: Avoid fortify_memset in OBD_FREE_PTR

OBD_FREE_PTR will optionally clear the memory that is about
to be freed.

Unfortunately fortify_memset_chk() hits some false positives.

We can use __underlying_memset() if it is defined, to avoid
the fortify_memset_chk.
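
A minimal sketch of the "use it if it is defined" pattern follows, assuming
__underlying_memset is visible as a preprocessor macro when available. The
names sketch_memset, sketch_poison, sketch_free_ptr and the fill byte are
hypothetical; this is not the OBD_FREE_PTR implementation. Built outside the
kernel, the macro is normally absent and the fallback to memset() is used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Prefer __underlying_memset() when the build environment defines it as a
 * macro; otherwise fall back to plain memset().
 */
#ifdef __underlying_memset
#define sketch_memset(p, c, n) __underlying_memset((p), (c), (n))
#else
#define sketch_memset(p, c, n) memset((p), (c), (n))
#endif

#define SKETCH_POISON 0x5a   /* hypothetical fill byte */

/* Overwrite a buffer with the poison byte. */
static void sketch_poison(void *ptr, size_t size)
{
	sketch_memset(ptr, SKETCH_POISON, size);
}

/* Hypothetical "clear, then free" helper. */
static void sketch_free_ptr(void *ptr, size_t size)
{
	if (ptr == NULL)
		return;
	sketch_poison(ptr, size);
	free(ptr);
}
```

Selecting the function through one macro keeps call sites unchanged whether
or not the fortified wrapper is bypassed.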

Lustre-change: https://review.whamcloud.com/52559
Lustre-commit: 58cc8cf98e37e9d8149d5f605a75d56f2cd4eb70

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iced53f22b97ed90e0970625c4fcbaa404054c54a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53956
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-16518 build: llvm/clang support
Timothy Day [Thu, 15 Feb 2024 00:43:19 +0000 (16:43 -0800)]
LU-16518 build: llvm/clang support

Other projects, notably Linux, have build support for LLVM and
Clang via special environment variables. This is implemented
for Lustre, in the style of:

https://www.kernel.org/doc/html/latest/kbuild/llvm.html

Instances in which GCC is explicitly called are replaced by the
use of $CC. The proper environment variables are passed to make
invocations as needed.

All checks which influence global compiler and toolchain settings
are collected in 'config/lustre-toolchain.m4'.

A configure option is added to disable the strict error flags that
are passed to the C compiler by default. CFLAGS and EXTRA_CFLAGS
are made to work in the typical way. Having fine grained control
over compiler options makes experimenting with Clang smoother.

Some compile checks in 'lustre-core.m4' have been improved by using
unused variables and explicitly setting the compile flag to be used
during the test.

This also sets the execute bit on autogen.sh.

Tested with:
Linux (mainline) - 5.15.94
openZFS - 2.1.99
Lustre (latest master) - 2.15.55
CentOS - 8.5
Clang (default on CentOS) - 12.0.1

Lustre-change: https://review.whamcloud.com/50063
Lustre-commit: 7f1aa5b66b247f339a9e7c25415a9a5dd272763c

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia8654c22fa8fca7bfb96c545ac144a1d3737fa00
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54054
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-10994 clio: remove cpo_assume, cpo_unassume, cpo_fini
John L. Hammond [Wed, 14 Feb 2024 21:07:19 +0000 (13:07 -0800)]
LU-10994 clio: remove cpo_assume, cpo_unassume, cpo_fini

Remove the cl_page methods cpo_assume, cpo_unassume, and
cpo_fini. These methods were only implemented by the vvp layer and so
they can be easily inlined into cl_page_assume() and
cl_page_unassume().

Lustre-change: https://review.whamcloud.com/47373
Lustre-commit: 9045894fe0f5033334a39a35a6332dab4498e21e

LU-6142 clio: make cp_ref in cl_page a refcount_t

As this is used as a refcount, it should be declared
as one.

Lustre-change: https://review.whamcloud.com/49072
Lustre-commit: e19804a3b7e793a11b1c8b5e0db9f6315f243b8c

Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I260c5593983bac6742cf7577c26a4903e95ceb7c
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54037
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-10994 clio: remove cpo_own and cpo_disown
John L. Hammond [Wed, 14 Feb 2024 09:17:56 +0000 (01:17 -0800)]
LU-10994 clio: remove cpo_own and cpo_disown

Remove the cpo_own and cpo_disown methods from struct
cl_page_operations. These methods were only implemented by the vvp
layer so they can be inlined into cl_page_own0() and
cl_page_disown(). Move most of vvp_page_discard() and all of
vvp_transient_page_discard() into cl_page_discard().

Lustre-change: https://review.whamcloud.com/47372
Lustre-commit: 81c6dc423ce4c62a64d328e49697d26194177f9f

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I3f156d6ca3e4ea11c050b2addda38e84a84634b9
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-10994 clio: remove cl_page_export() and cl_page_is_vmlocked()
John L. Hammond [Wed, 14 Feb 2024 20:06:24 +0000 (12:06 -0800)]
LU-10994 clio: remove cl_page_export() and cl_page_is_vmlocked()

Remove cl_page_export() and cl_page_is_vmlocked(), replacing them with
direct calls to SetPageUptodate() and PageLocked().

Lustre-change: https://review.whamcloud.com/47241
Lustre-commit: 3d52a7c5753e80e78c3b6f6bb7a0b66b37f4849b

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I883d1664f4afc7a1d4006f9f4833db8125c0e8f5
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-10994 echo: remove client operations from echo objects
John L. Hammond [Wed, 14 Feb 2024 20:00:59 +0000 (12:00 -0800)]
LU-10994 echo: remove client operations from echo objects

Remove the client (io, page, lock) operations from echo_client
objects. This will facilitate the simplification of CLIO.

Lustre-change: https://review.whamcloud.com/47240
Lustre-commit: 6060ee55b194e37e87031c40e9d48f967eabe314

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: If9e55c7d54c171aa2e1bcf272641c2bd6be8ad48
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-10994 test: remove netdisk from obdfilter-survey
John L. Hammond [Wed, 14 Feb 2024 19:29:24 +0000 (11:29 -0800)]
LU-10994 test: remove netdisk from obdfilter-survey

Remove the netdisk case from obdfilter-survey. Remove subtests that
use echo_client over osc devices.

Lustre-change: https://review.whamcloud.com/47239
Lustre-commit: 51c491dac6aec99fc328732b4358e8d5732dc230

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I260001241cee3027f68e62077e5817221bd0c08b
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months ago LU-10994 lov: remove lov_page
John L. Hammond [Wed, 14 Feb 2024 08:57:14 +0000 (00:57 -0800)]
LU-10994 lov: remove lov_page

Remove the lov page layer, since it does nothing but cost 24 bytes per
page plus pointer chases.

Lustre-change: https://review.whamcloud.com/47221
Lustre-commit: 56f520b1a4c9ae64caa235e9ce7699e7fb627f0c

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Icd7b4b0041e0fe414a3a4143179f45845177960e
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-15477 osc: osc_extent_wait() deadlock
Andriy Skulysh [Wed, 14 Feb 2024 08:49:30 +0000 (00:49 -0800)]
LU-15477 osc: osc_extent_wait() deadlock

Thread 1:
vvp_io_write_commit
osc_io_commit_async
osc_page_cache_add
osc_extent_find
osc_extent_wait

Thread 2:
ptlrpcd_check
ptlrpc_check_set
brw_queue_work
osc_extent_make_ready
vvp_page_make_ready_start
__lock_page

We must not hold a page lock while we do osc_extent_find()

Lustre-change: https://review.whamcloud.com/46281
Lustre-commit: 821a8d7b481d34a54044dfe871e4532f0996de8a

Change-Id: Idf669bc8d9c943f28e3f5986826b9637d66ecfca
HPE-bug-id: LUS-10414
Fixes: a7299cb012 "LU-9920 vvp: dirty pages with pagevec"
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-11290 osc: Batch gang_lookup cbs
Patrick Farrell [Wed, 14 Feb 2024 08:44:47 +0000 (00:44 -0800)]
LU-11290 osc: Batch gang_lookup cbs

The osc_page_gang_lookup callbacks can be trivially
converted to operate in batches rather than one page at a
time.  This improves cancellation time for locks protecting
large numbers of pages by about 10% (after landing
another optimization, "LU-11290 ldlm: page discard speedup",
it shows 6% for canceling a lock on a 30GB cached file).

Truncate to zero time (with one lock protecting many pages)
was improved by about 5-10% as well.  Lock weighing
performance should be improved slightly as well, but is
tricky to benchmark.
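
As a rough illustration of the batching idea only (this is not the
osc code; BATCH_SIZE, batch_cb_t, for_each_batch() and count_items()
are hypothetical names), a callback can be invoked once per chunk of
items instead of once per item:

```c
#include <stddef.h>

#define BATCH_SIZE 16 /* illustrative value, not taken from the patch */

/* Hypothetical callback type: receives a whole batch of items at once. */
typedef void (*batch_cb_t)(const int *items, size_t count, void *data);

/* Walk the items in chunks of up to BATCH_SIZE and invoke the callback
 * once per chunk. Returns the number of callback invocations. */
static size_t for_each_batch(const int *items, size_t total,
			     batch_cb_t cb, void *data)
{
	size_t calls = 0;

	for (size_t off = 0; off < total; off += BATCH_SIZE) {
		size_t n = total - off;

		if (n > BATCH_SIZE)
			n = BATCH_SIZE;
		cb(items + off, n, data);
		calls++;
	}
	return calls;
}

/* Example callback: accumulate how many items were handed over. */
static void count_items(const int *items, size_t count, void *data)
{
	(void)items;
	*(size_t *)data += count;
}
```

With 40 items and a batch size of 16, the callback runs three times
rather than forty, which is where the per-call overhead is saved.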

Lustre-change: https://review.whamcloud.com/33089
Lustre-commit: 0d6d0b7bc95a82dee02d35d0a8a41d24692cad45

HPE-bug-id: LUS-6432
Change-Id: Ib30594ae97182cbeb18051d6cee860c97ae7e119
Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54031
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-14047 lustre: change EWOULDBLOCK to EAGAIN
John L. Hammond [Wed, 14 Feb 2024 08:39:27 +0000 (00:39 -0800)]
LU-14047 lustre: change EWOULDBLOCK to EAGAIN

On linux, EWOULDBLOCK has always been defined as an alias for
EAGAIN. In the interest of readability we should not use two names for
the same thing. So change the remaining uses of EWOULDBLOCK to EAGAIN
and add EWOULDBLOCK||EAGAIN to spelling.txt.
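
A small userspace sketch of why one name is enough on Linux. The
helper should_retry() is hypothetical; the static assertion simply
documents the aliasing and would fail the build on a platform where
the two values differ:

```c
#include <errno.h>
#include <stdbool.h>

/* On Linux, EWOULDBLOCK is defined as an alias for EAGAIN, so a single
 * comparison covers both names. */
_Static_assert(EWOULDBLOCK == EAGAIN,
	       "EWOULDBLOCK is expected to alias EAGAIN");

/* Hypothetical helper: report whether an errno value means "try again". */
static bool should_retry(int err)
{
	return err == EAGAIN;
}
```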

Lustre-change: https://review.whamcloud.com/40307
Lustre-commit: a7f48e6c15e28617793d89958c79e9ed8cb73e65

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib48b8a1e58bfa961d2a4ba411c038c476bfc300d
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54030
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoRM-620 build: New tag 2.14.0-ddn136
Andreas Dilger [Sat, 24 Feb 2024 03:53:00 +0000 (20:53 -0700)]
RM-620 build: New tag 2.14.0-ddn136

New tag 2.14.0-ddn136

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If89f3f4d33b83da86a63b998d141793db509c013

15 months agoEX-8669 llite: set STATX_ATTR_COMPRESSED flag in ll_iocontrol()
Jian Yu [Wed, 7 Feb 2024 09:08:04 +0000 (01:08 -0800)]
EX-8669 llite: set STATX_ATTR_COMPRESSED flag in ll_iocontrol()

This patch extracts the compression flag LUSTRE_COMPR_FL from
mbo_flags and sets the STATX_ATTR_COMPRESSED flag in ll_iocontrol()
to satisfy lsattr and other e2fsprogs tools.

Test-Parameters: trivial testlist=sanity-compr

Change-Id: I14d2082a6719a1ca5708f7aef7a2fb0f085ca63c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53953
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-9107 ldiskfs: sync ext4-mballoc-dense with master
Alex Zhuravlev [Wed, 31 Jan 2024 20:05:49 +0000 (23:05 +0300)]
EX-9107 ldiskfs: sync ext4-mballoc-dense with master

extend ac_flags to fit new EXT4_MB_VERY_DENSE

Fixes: f36eda6a1e ("LU-10026 osd-ldiskfs: use preallocation for dense writes")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id024cbca902d56728133d7d3e69d56fc355c1bc1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53871
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17034 quota: lqeg_arr memmory corruption
Sergey Cheremencev [Fri, 25 Aug 2023 06:22:26 +0000 (10:22 +0400)]
LU-17034 quota: lqeg_arr memmory corruption

Fix memory corruption caused by accessing memory
outside of the lqeg_arr array. It could happen when at least
one OST has an index larger than the total number
of OSTs, for example if the system has 4 OSTs with
indexes 0001, 0002, 00c9, 00ca. This issue most often
corrupted bucket_table in obd_uuid_hash or obd_nid_hash,
causing the rhashtable code to crash. However, it could
also be the cause of other panics, depending on the type
of the corrupted neighbouring memory region.

This patch adds an lge_idx field to each lqe global entry
to store the index of the OST. It is needed to map the OST index
to the array index and avoid out-of-bounds array access.
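
The mapping can be sketched as follows. This is a simplified
illustration, not the qmt code: struct glbl_entry, its qunit field and
slot_for_ost() are hypothetical, and only the lge_idx name comes from
the description above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one global quota entry. */
struct glbl_entry {
	int lge_idx;              /* OST index this slot belongs to */
	unsigned long long qunit; /* illustrative payload */
};

/* Return the array slot holding ost_idx, or -1 if it is absent.
 * Using ost_idx directly as the subscript would read past the end of
 * a 4-element array for an OST index such as 0xc9. */
static int slot_for_ost(const struct glbl_entry *arr, size_t count,
			int ost_idx)
{
	for (size_t i = 0; i < count; i++)
		if (arr[i].lge_idx == ost_idx)
			return (int)i;
	return -1;
}
```

With the OST indexes from the example (0001, 0002, 00c9, 00ca), index
00c9 resolves to slot 2 instead of subscript 201.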

This patch also adds locking to protect lqe_glbl_data in
qmt_set_revoke and qmt_clear_lgeg_arr_nu. This was
missed in 50ff4d1da6.

This patch begins to store all connected MDTs in the quota
global pool. From this patch on, MDTs are handled the same
way as OSTs stored in the global pool. It is the
first step towards introducing MDT pools.

Add conf-sanity_33c, which reproduces the memory
corruption described above when run without the fix.

Lustre-change: https://review.whamcloud.com/52094
Lustre-commit: 67f90e42889ff22d574e82cc647f6076e48c65a5

Fixes: 50ff4d1da6 ("LU-16772 quota: protect lqe_glbl_data in qmt_site_recalc_cb")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Id6e4bcde09d9f32726d69f711eedb82729a2266e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53810
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17034 revert: "quota: tmp fix against memory corruption"
Sergey Cheremencev [Thu, 18 Jan 2024 19:03:50 +0000 (22:03 +0300)]
LU-17034 revert: "quota: tmp fix against memory corruption"

This reverts commit fdcb1144c95908bbbd0216ec931ac5f222f484a7
as it was a temporary solution. It will be replaced by
"LU-17034 quota: lqeg_arr memmory corruption".

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I6c057ff7e0f9c8789190c51c14fc370afe0c703c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17334 lmv: handle object created on newly added MDT
Lai Siyao [Thu, 7 Dec 2023 12:39:09 +0000 (07:39 -0500)]
LU-17334 lmv: handle object created on newly added MDT

When a new MDT is added to a filesystem without no_create, then a new
object is created on the MDT relatively quickly after it is added to
the filesystem, in particular because the new MDT would be preferred
by QOS space balancing due to lots of free space. However, it might
take a few seconds for the addition of the new MDT to be propagated
across all of the clients, so there is a risk that one client creates
a directory on an MDT that another client is not yet aware of, which
returns an error to the application immediately.

This patch fixes the issue by adding lmv_tgt_retry(), which retries
the MDT lookup and waits some number of seconds for the filesystem
layout to be updated if the MDT index of an existing file/directory is
not found.

Commands that depend on user input, like 'lfs mkdir -i' and 'lfs df',
and round-robin MDT allocation will continue to use lmv_tgt(), which
doesn't retry. If the user specifies a wrong MDT index, retrying
could hang the command for an extended period of time.
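
The difference between the two lookup styles can be sketched in
userspace as below. Everything here is hypothetical (tgt_find(),
tgt_find_retry(), the callback types and the fake_layout helpers); it
only illustrates a single attempt versus a bounded retry with a wait
between attempts, not the actual lmv implementation.

```c
#include <stddef.h>

/* Hypothetical lookup callback: non-NULL once the MDT index is known. */
typedef void *(*tgt_lookup_t)(unsigned int mdt_idx, void *ctx);
/* Hypothetical wait hook, so the pause between attempts is pluggable. */
typedef void (*tgt_wait_t)(void *ctx);

/* Single attempt, no waiting: suits user-supplied indices, where a
 * wrong value should fail quickly instead of hanging the command. */
static void *tgt_find(tgt_lookup_t lookup, unsigned int mdt_idx, void *ctx)
{
	return lookup(mdt_idx, ctx);
}

/* Bounded retry: wait between attempts so a layout update can arrive. */
static void *tgt_find_retry(tgt_lookup_t lookup, tgt_wait_t wait,
			    unsigned int mdt_idx, unsigned int max_tries,
			    void *ctx)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		void *tgt = lookup(mdt_idx, ctx);

		if (tgt != NULL)
			return tgt;
		if (i + 1 < max_tries)
			wait(ctx);
	}
	return NULL;
}

/* Illustrative state: the layout update "arrives" after some waits. */
struct fake_layout {
	unsigned int known_mdts;         /* indices [0, known_mdts) known */
	unsigned int waits_until_update; /* waits before one MDT is added */
	unsigned int waits;              /* waits performed so far */
};

static void *fake_lookup(unsigned int mdt_idx, void *ctx)
{
	struct fake_layout *l = ctx;

	return mdt_idx < l->known_mdts ? ctx : NULL;
}

static void fake_wait(void *ctx)
{
	struct fake_layout *l = ctx;

	if (++l->waits == l->waits_until_update)
		l->known_mdts++;
}
```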

Lustre-change: https://review.whamcloud.com/53363
Lustre-commit: 94a4663db95656ade6b6e695b849cd7763f0bd49

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idb0cf65e95f665628d6799298732b7a06cde4a86
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54018
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17469 llite: hold object reference in IO
Bobi Jam [Mon, 22 Jan 2024 12:14:56 +0000 (20:14 +0800)]
LU-17469 llite: hold object reference in IO

There could be a race between page write and inode free. Hold
a cl_object reference during the IO to avoid accessing a freed object.

Lustre-change: https://review.whamcloud.com/53819
Lustre-commit: TBD (from a84242bc202e402664a5f5d7461b66c770896851)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic70cc27430e68265aba0662fc68e9bfe2f86cfe1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53760
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <paf0187@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoDDN-4630 sec: protect against concurrent mi_ginfo change
Sebastien Buisson [Thu, 22 Feb 2024 12:44:57 +0000 (13:44 +0100)]
DDN-4630 sec: protect against concurrent mi_ginfo change

With the INTERNAL upcall mechanism, we put in the upcall cache the
groups received from the client, by appending them to a list built
from previous requests.
An existing entry is never modified once it is marked as VALID; it is
replaced with a new one, with a larger groups list. However, the group
info associated with an entry can change when updated from NEW to
VALID. This means the number of groups can only grow from 0 (group
info not set) to the current number of collected groups.
In case of concurrent cache entry update, we need to check the group
info and start over adding the groups associated with the current
request.

Fixes: 4515e5365f ("LU-17015 build: rework upcall cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie7088bdbfcae396602b59e2ab07fbfbbb14d96af
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16297 ptlrpc: don't panic during reconnection
Alexander Boyko [Thu, 3 Nov 2022 11:23:20 +0000 (07:23 -0400)]
LU-16297 ptlrpc: don't panic during reconnection

ptlrpc_send_rpc() can race with ptlrpc_connect_import_locked()
in the middle of an assertion check, which leads to a spurious panic.
The first part of the assertion checks

(AT_OFF || imp->imp_state != LUSTRE_IMP_FULL ||

then a reconnect changes the import state and flags
before the second part is evaluated:

(imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) ||
!(imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_AT)))

MSGHDR_AT_SUPPORT is disabled during client reconnection.
Taking a lock in this hot path is undesirable, so the fix changes
the assertion into a report.
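
A userspace sketch of turning such a check into a report. The struct,
the helper names and the constant values are placeholders chosen for
illustration; only the shape of the condition follows the assertion
quoted above, and the real code uses Lustre's own logging macros.

```c
#include <stdbool.h>
#include <stdio.h>

/* Placeholder values; the real constants live in the Lustre headers. */
enum { LUSTRE_IMP_FULL = 1 };
#define MSGHDR_AT_SUPPORT 0x1u
#define OBD_CONNECT_AT    0x1ull

/* Simplified stand-in for the import fields named in the assertion. */
struct import_view {
	int imp_state;
	unsigned int imp_msghdr_flags;
	unsigned long long ocd_connect_flags;
};

/* The condition that the assertion used to enforce. */
static bool at_state_consistent(bool at_off, const struct import_view *imp)
{
	return at_off || imp->imp_state != LUSTRE_IMP_FULL ||
	       (imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) ||
	       !(imp->ocd_connect_flags & OBD_CONNECT_AT);
}

/* Report instead of panic: log the inconsistency and let the caller
 * carry on. Returns the result of the check. */
static bool check_at_state(bool at_off, const struct import_view *imp)
{
	bool ok = at_state_consistent(at_off, imp);

	if (!ok)
		fprintf(stderr,
			"import state %d flags %#x changed during reconnect\n",
			imp->imp_state, imp->imp_msghdr_flags);
	return ok;
}
```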

Lustre-change: https://review.whamcloud.com/49029
Lustre-commit: df31c4c0b39b8845911344e6fadc008bcba40bb1

HPE-bug-id: LUS-10985
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ifc9e413c679c3e8a4c8f4f541251bebabae41c82
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-16281 clio: append to non-existent component
Vitaly Fertman [Tue, 5 Jul 2022 21:00:58 +0000 (00:00 +0300)]
LU-16281 clio: append to non-existent component

Appending to a non-existent component should return an error, but it
currently fails with the BUG below because the @rc of
lov_io_layout_at() is not checked for < 0

    stripe_width()) ASSERTION( index < lsm->lsm_entry_count ) failed:
    BUG: unable to handle kernel paging request at ffff99d3c2f74030
    Call Trace:
      lov_stripe_number+0x19/0x40 [lov]
      lov_page_init_composite+0x103/0x5f0 [lov]
      ? kmem_cache_alloc+0x12e/0x270
      cl_page_alloc+0x19f/0x660 [obdclass]
      cl_page_find+0x1a0/0x250 [obdclass]
      ll_write_begin+0x1f7/0xfb0 [lustre]
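
The missing check can be sketched as follows. struct layout,
layout_at() and component_start() are hypothetical stand-ins, and the
-ENODATA return value is only an illustrative choice, not necessarily
the error the patch returns.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical layout: component i covers [start[i], end[i]). */
struct layout {
	size_t entry_count;
	const unsigned long long *start;
	const unsigned long long *end;
};

/* Stand-in for a lookup such as lov_io_layout_at(): returns the index
 * of the component covering offset, or a negative value if none does. */
static int layout_at(const struct layout *l, unsigned long long offset)
{
	for (size_t i = 0; i < l->entry_count; i++)
		if (offset >= l->start[i] && offset < l->end[i])
			return (int)i;
	return -1;
}

/* Check rc before using it as an index; a negative rc used as a
 * subscript is what leads to the out-of-range access. */
static int component_start(const struct layout *l,
			   unsigned long long offset,
			   unsigned long long *out)
{
	int rc = layout_at(l, offset);

	if (rc < 0)
		return -ENODATA; /* illustrative error code */
	*out = l->start[rc];
	return 0;
}
```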

Lustre-change: https://review.whamcloud.com/48994
Lustre-commit: 8fdeca3b6faf22c72f6687aa23b86715d39ceeb1

HPE-bug-id: LUS-11075
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I4371f56cd9cdb3429d52a283831fb0a768e5c9c3
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54133
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
15 months agoLU-14441 mdc: check/grab import before access
Alex Zhuravlev [Mon, 13 Dec 2021 08:27:42 +0000 (11:27 +0300)]
LU-14441 mdc: check/grab import before access

to ensure the import doesn't disappear while being accessed
via procfs.

Lustre-change: https://review.whamcloud.com/41681
Lustre-commit: b8416320b381ae8a6fdd058b0a09ea42ce56d573

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I005c96b349e55646996fd0d265ab4dd1e2b9a1fa
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54126
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17484 gss: reply error for SEC_CTX_INIT on wrong node
Sebastien Buisson [Thu, 8 Feb 2024 12:44:21 +0000 (13:44 +0100)]
LU-17484 gss: reply error for SEC_CTX_INIT on wrong node

When a server receives a SEC_CTX_INIT request for a target that is not
available (either stopping, or not set up yet, or moved to a failover
node), the request gets dropped. This makes the client-side RPC time
out, increasing the time it takes to establish a proper gss context
with the target, because it slows down the HA mechanism that tries
alternate failover NIDs.
Instead of dropping the request reply for SEC_CTX_INIT, the server
needs to send back a proper error reply. The client will then be able
to immediately try alternate failover NIDs, speeding mount/reconnect
process up, and avoiding potential eviction.

Lustre-change: https://review.whamcloud.com/53970
Lustre-commit: 3d635dd3f24421c181aca5673cd81ed8f3e2c622

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id2cefaa7d54729a63c7be13b65d7ace579bcaa78
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54157
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17528 gss: cleanup gss api usage
Sebastien Buisson [Thu, 15 Feb 2024 08:58:16 +0000 (09:58 +0100)]
LU-17528 gss: cleanup gss api usage

The lucid context support has been available from at least
krb5 1.7, and even RHEL7 ships with a more recent version.
So drop support for the non-lucid API, and clean up GSS API usage.

Lustre-change: https://review.whamcloud.com/54063
Lustre-commit: 79a2d8645a28de77c7406ba56889d3a0749b851c

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I91fb706d2444c199156423b57a8c1ef24a0c3420
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17535 gss: fix lsvcgssd crash in krb lib
Bruno Faccini [Tue, 13 Feb 2024 11:14:40 +0000 (12:14 +0100)]
LU-17535 gss: fix lsvcgssd crash in krb lib

This patch fixes the logic that decides whether
gss_delete_sec_context() needs to be called, which depends on
the Kerberos implementation.

The address of snd->ctx, instead of its value, should be passed to
serialize_context_for_kernel()/serialize_krb5_ctx() to
allow each implementation to reset it to GSS_C_NO_CONTEXT
if the context has been destroyed internally. Cases where it has
not been destroyed can now also be handled in handle_krb().
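
The pointer-to-handle idiom can be illustrated without the GSS
library. All names below (ctx_id_t, NO_CONTEXT, serialize_consuming(),
serialize_keeping(), cleanup_ctx()) are hypothetical stand-ins for
gss_ctx_id_t, GSS_C_NO_CONTEXT and the functions named above:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the GSS context handle and its null value. */
typedef struct sec_ctx { int handle; } *ctx_id_t;
#define NO_CONTEXT ((ctx_id_t)0)

/* A serializer that destroys the context internally. Taking the
 * address of the handle lets it clear the caller's copy as well. */
static int serialize_consuming(ctx_id_t *ctx)
{
	free(*ctx);
	*ctx = NO_CONTEXT;
	return 0;
}

/* A serializer that leaves the context alive for the caller. */
static int serialize_keeping(ctx_id_t *ctx)
{
	(void)ctx;
	return 0;
}

/* Caller-side cleanup: delete only if the context still exists.
 * Returns 1 if it deleted the context, 0 if nothing was left to do. */
static int cleanup_ctx(ctx_id_t *ctx)
{
	if (*ctx == NO_CONTEXT)
		return 0;
	free(*ctx);
	*ctx = NO_CONTEXT;
	return 1;
}
```

Had the handle been passed by value, the caller would still hold a
stale pointer after the consuming serializer and would delete the
context a second time.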

Lustre-change: https://review.whamcloud.com/54023
Lustre-commit: f2705c4ec5598ca244bbb08673a1cfefd7342812

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I752712168a2c0f0a5a7a496b851d4cddbb7e4236
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54155
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-17226 build: create config option for l_getsepol
Gian-Carlo DeFazio [Thu, 16 Nov 2023 23:05:45 +0000 (15:05 -0800)]
LU-17226 build: create config option for l_getsepol

Add a configuration option for l_getsepol.
l_getsepol is built by default unless the --disable-l_getsepol
option is given to configure.
lustre.spec.in builds l_getsepol by default and has its
dependencies as build requirements.

The implicit configuration check for the dependency
openssl-devel is removed and replaced by a BuildRequires.

Lustre-change: https://review.whamcloud.com/52849
Lustre-commit: 2777adcabd1032ddb886f913fa04d82a292ab379

Test-Parameters: trivial
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: If71a2a4a524047edbd2b31e6fac7a42f36a030bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54162
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-9074 csdc: Provide finer grained enable_compression control
Artem Blagodarenko [Fri, 16 Feb 2024 16:50:08 +0000 (16:50 +0000)]
EX-9074 csdc: Provide finer grained enable_compression control

On all architectures other than aarch64 and ppc64le, enable_compression
is now enabled by default and the lfs warning message is gone.

To use CSDC on aarch64/ppc64le (at your own risk),
llite.*.enable_compression=1 should be set. The lfs
set_stripe command still prints a warning message in this case.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Test-Parameters: trivial
Change-Id: Ic8edc5bbeb8f9a3cd34ad3fc4e8c78e59f4cc34f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53894
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Patrick Farrell <paf0187@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoEX-8993 ofd: do not write 'hole' pages on compression
Patrick Farrell [Tue, 13 Feb 2024 16:45:17 +0000 (11:45 -0500)]
EX-8993 ofd: do not write 'hole' pages on compression

When doing unaligned read-modify-write to a compressed file,
we must round the IO lnb used for write in order to read up
the compressed data for modification.

In some cases, this creates a situation where there are
pages in the write lnb which have no data in them.  It is
important not to write out these pages, because if we do,
this wastes space and can cause incorrect file size.

In most cases, the file size is covered by the client
sending the file size, but if the client does not compress
a particular write, it does not send the size and the server
does not use it.  We could resolve this by having the client
always send size info and have the server always use it, but
it's better to make server writes 'hole' aware, since this
improves space usage.  (And this will be required for the
server to do recompression on read-modify-write, otherwise
no space is gained.)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I66169e205fe4691ed03b2c9b3005ffc4ecd3213d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53595
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17258 socklnd: stop connecting on too many retries
Serguei Smirnov [Wed, 7 Feb 2024 18:48:08 +0000 (10:48 -0800)]
LU-17258 socklnd: stop connecting on too many retries

If peer repeatedly rejects connection requests with EALREADY,
assume that it doesn't support as many connections as we're trying
to create. Make sure to stop connecting to the peer altogether and
either continue with already created connections if there's at least
one of each type, or fail.
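
The decision described above can be sketched as a small state
machine. The retry limit, the enum and the struct are all hypothetical
and only illustrate the policy, not the socklnd data structures:

```c
#include <stdbool.h>

#define MAX_EALREADY_RETRIES 3 /* illustrative limit, not from the patch */

enum conn_decision { CONN_RETRY, CONN_USE_EXISTING, CONN_FAIL };

/* Hypothetical per-peer bookkeeping. */
struct peer_conn_state {
	unsigned int ealready_rejects; /* EALREADY rejections seen so far */
	bool have_each_type;           /* one connection of each type exists */
};

/* Called when the peer rejects a connection request with EALREADY. */
static enum conn_decision on_ealready(struct peer_conn_state *s)
{
	if (++s->ealready_rejects < MAX_EALREADY_RETRIES)
		return CONN_RETRY;
	/* Too many rejections: stop connecting to this peer altogether. */
	return s->have_each_type ? CONN_USE_EXISTING : CONN_FAIL;
}
```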

This helps avoid the assertion:

"ASSERTION( (wanted & ((((1UL))) << (3))) != 0 ) failed"

Lustre-change: https://review.whamcloud.com/53955
Lustre-commit: 02caf7170762d97dac4f367651addc7d90b6eb32

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 5afe3b053 ("LU-17258 socklnd: ensure connection type established upon race")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I6072e91cc36544fc2f56c91cd78f6637cf82ecbc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-17505 socklnd: return NETWORK_TIMEOUT to LNet on ETIMEOUT
Serguei Smirnov [Mon, 5 Feb 2024 23:27:15 +0000 (15:27 -0800)]
LU-17505 socklnd: return NETWORK_TIMEOUT to LNet on ETIMEOUT

Returning LNET_MSG_STATUS_LOCAL_TIMEOUT to LNet on ETIMEDOUT
causes LNet to only decrement the local NI health score,
while the issue may actually be with the remote NI.

Changing this to return LNET_MSG_STATUS_NETWORK_TIMEOUT
causes LNet to decrement both local NI and peer NI health.
If local NI is ok, it will recover its health score quickly,
but the affected peer NI health is lowered until peer NI is recovered.
This helps LNet select healthy NIs of the same peer in the meantime.
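
A toy model of the effect on health scores. The two timeout status
names come from the text above; MSG_STATUS_OTHER, the numeric values,
struct ni_health and the decrement of 1 are placeholders, and real
LNet health accounting is more involved.

```c
#include <errno.h>

/* Placeholder numbering; only the two timeout names are from the text. */
enum msg_status {
	MSG_STATUS_OTHER = 0, /* hypothetical catch-all for this sketch */
	LNET_MSG_STATUS_LOCAL_TIMEOUT,
	LNET_MSG_STATUS_NETWORK_TIMEOUT,
};

/* Hypothetical pair of health scores for a local NI and a peer NI. */
struct ni_health {
	int local;
	int peer;
};

/* Map a socket error to the status reported to LNet. Only ETIMEDOUT
 * is modelled here. */
static enum msg_status status_for_errno(int err)
{
	return err == ETIMEDOUT ? LNET_MSG_STATUS_NETWORK_TIMEOUT
				: MSG_STATUS_OTHER;
}

/* Apply the health penalty associated with each status. */
static void apply_health(struct ni_health *h, enum msg_status st)
{
	switch (st) {
	case LNET_MSG_STATUS_LOCAL_TIMEOUT:
		h->local--;          /* only the local NI is penalised */
		break;
	case LNET_MSG_STATUS_NETWORK_TIMEOUT:
		h->local--;          /* both ends are penalised */
		h->peer--;
		break;
	default:
		break;
	}
}
```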

Lustre-change: https://review.whamcloud.com/53930
Lustre-commit: 099350d6e30218eb68d31cbfc7e9252a112e591f

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I916772477d1fd63571447262880a33830746f002
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53964
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-16752 test: improve sanity 413a/b reliability
Lai Siyao [Thu, 22 Feb 2024 18:46:12 +0000 (13:46 -0500)]
LU-16752 test: improve sanity 413a/b reliability

Set qos_maxage to 1 early in test_qos_mkdir() to ensure statfs data is
updated in the round-robin mkdir test, so that the subsequent QoS mkdir
behaves as expected.

Lustre-change: https://review.whamcloud.com/54168
Lustre-commit: TBD (from f22e115c6a468452d4beb40c6530f4cc0627022b)

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity
Fixes: 233344d451 ("LU-13417 test: generate uneven MDTs early for sanity 413")
Fixes: c1d0a355a6 ("LU-12624 lod: alloc dir stripes by QoS")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I08f94b5b4e355ffff0704bd0f661bb99a82a9234
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54164
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoRM-620 build: New tag 2.14.0-ddn135
Andreas Dilger [Wed, 14 Feb 2024 19:22:38 +0000 (12:22 -0700)]
RM-620 build: New tag 2.14.0-ddn135

New tag 2.14.0-ddn135

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I61d16abd2cf5185d1e05b2e67fd5404adb9f23c9

16 months agoRM-620 build: New tag lipe-2.42
Andreas Dilger [Wed, 14 Feb 2024 19:22:15 +0000 (12:22 -0700)]
RM-620 build: New tag lipe-2.42

New tag lipe-2.42

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaff6874778b27e6fa4a7be1f85fa0d47636190d6

16 months agoEX-9156 lipe: Print host name in SSH errors
Alexandre Ioffe [Thu, 8 Feb 2024 04:05:07 +0000 (20:05 -0800)]
EX-9156 lipe: Print host name in SSH errors

Improvement: Add host name in error messages

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I8f552d34d0445ab35d9b978b13b3989411f95cdb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoDDN-4630 sec: cleanup grouplist alloc for INTERNAL id upcall
Sebastien Buisson [Thu, 8 Feb 2024 17:17:52 +0000 (18:17 +0100)]
DDN-4630 sec: cleanup grouplist alloc for INTERNAL id upcall

With the INTERNAL identity upcall, we are using supplementary groups
provided by the client, by building an array of gid_t.
Clean up this group list allocation, and make sure the size returned
matches the actual size of the allocated array.

Fixes: 4515e5365f ("LU-17015 build: rework upcall cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I72cdfc6b76bfd9c2832a5d5e5f72c3aa45cf1efe
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53979
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8927 csdc: check ocd flag for fiemap/lseek
Bobi Jam [Tue, 6 Feb 2024 09:13:27 +0000 (17:13 +0800)]
EX-8927 csdc: check ocd flag for fiemap/lseek

Currently the client does not send fiemap/lseek requests to the OST if
the file is compressed. This patch checks the
OBD_CONNECT2_COMPRESS flag and sends the request if the OST supports
compression, as the server performs the fiemap/lseek check instead.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I478cbf161044165fa31d4caa2336e9949fc626fe
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53935
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8927 csdc: server reject FIEMAP/SEEK_HOLE|DATA on compr obj
Bobi Jam [Fri, 2 Feb 2024 07:15:30 +0000 (15:15 +0800)]
EX-8927 csdc: server reject FIEMAP/SEEK_HOLE|DATA on compr obj

Servers return -EOPNOTSUPP when they receive FIEMAP or
SEEK_HOLE/SEEK_DATA requests for compressed file objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9f04fbb13a22cc83402d9989daab63a59367ff33
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53886
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-14111 obdclass: count eviction per obd_device
Aurelien Degremont [Tue, 13 Oct 2020 14:12:23 +0000 (14:12 +0000)]
LU-14111 obdclass: count eviction per obd_device

Add a new 'obd_eviction_count' counter to obd_device which
is increased every time a client is evicted, which means
every time we call `class_fail_export()`.

Expose this counter through `lctl get_param *.*.eviction_count`
for every target.

Only support recovery-small test 146 for 2.14.0.133+.

Lustre-change: https://review.whamcloud.com/40528
Lustre-commit: 3c69d46e1766480c0ffd1bef840b4e167b4cf88e

Lustre-change: https://review.whamcloud.com/52098
Lustre-commit: b034dd27dd39483e40f91ea82d3f5c62b514ec54

Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: I83b691662285cf2cd937187bffa54de6bd1f694c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53897
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-17173 gss: user keys go to user keyring
Sebastien Buisson [Fri, 20 Oct 2023 08:27:14 +0000 (10:27 +0200)]
LU-17173 gss: user keys go to user keyring

Keys for root, that are used for Lustre internal processing, are
stored in the session keyring. That way they can be found by all
Lustre processes in userspace and in the kernel.
For end user keys, it is better to store them in the user keyring.
This simplifies key management, makes them shared across all user
sessions, and avoids an unfortunate key leak if lfs flushctx is not
called at user logout.

Lustre-change: https://review.whamcloud.com/52771
Lustre-commit: 02b456e4a445b9503b044df30932cc0fb5021f49

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibb3d326e89dcacc89e77eca76cdb773861d3a8a7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53908
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-17173 utils: cleanup lfs flushctx
Sebastien Buisson [Mon, 13 Nov 2023 10:02:24 +0000 (11:02 +0100)]
LU-17173 utils: cleanup lfs flushctx

When lfs flushctx is called without mount points, build the list of
all mounts first, and then call the ioctl to flush associated
contexts. Otherwise fetching the mount points unfortunately refreshes
the contexts being flushed, because the mount points are being
accessed.

Lustre-change: https://review.whamcloud.com/52604
Lustre-commit: f0534544e3e3aef280ccc5f042e37d42d33b28d3

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I75b9efe4c65ce66f5f692f9e49a28fde705d0140
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53909
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoLU-17173 tests: fix security related tests
Sebastien Buisson [Mon, 13 Nov 2023 10:03:38 +0000 (11:03 +0100)]
LU-17173 tests: fix security related tests

Several cleanups are required in security related tests.

In sanity-krb5, in order to get proper access to keyrings, use su -
instead of runas to initialize the process more completely.
Also fix the use of 'lfs flushctx', as some tests do not call it properly.
And in test_8, avoid waiting arbitrarily and change fail_loc to just
sleep once.

In sanity-krb5 and sanity-sec, fix parameters passed to
start_gss_daemons().

Lustre-change: https://review.whamcloud.com/53012
Lustre-commit: 9fc12ca7f29bd70be19471c2b9143d50d2e24eda

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4598ae5a7d28afbc39d7cc2d0afd1096d877d03b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53910
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoLU-17016 mdd: no EXDEV for parent dir projid mismatch
Andreas Dilger [Fri, 4 Aug 2023 05:01:42 +0000 (23:01 -0600)]
LU-17016 mdd: no EXDEV for parent dir projid mismatch

Don't return EXDEV if the parent directory projid of a renamed
directory does not match the projid of the target dir.  Only the
projid of the source directory itself and the target matter.
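
A minimal sketch of the rule described above, in Python for brevity. The
function name and arguments are hypothetical, and the sketch ignores any
flags or other conditions that mdd_rename_sanity_check() may apply; it
only shows that the projid of the source's parent directory plays no
part in the decision.

```python
import errno


def rename_projid_check(src_projid, src_parent_projid, tgt_dir_projid):
    """Return 0 if the rename may proceed, or errno.EXDEV otherwise.

    src_parent_projid is accepted only to make explicit that it is
    ignored: a mismatch between it and the target must not cause EXDEV.
    """
    if src_projid != tgt_dir_projid:
        return errno.EXDEV
    return 0
```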

Rename variables in mdd_rename_sanity_check() and mdd_rename()
so the object and attribute variable names are consistent.

Improve console error messages to contain more useful information.
Replace spaces with tabs in affected functions.

Lustre-change: https://review.whamcloud.com/51868
Lustre-commit: 1c033467317394d18a7aa05f6e81734bcbbcac75

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7aa53f6d168926719ad9fd5df3c760e6c73ebbe5
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53965
Tested-by: jenkins <devops@whamcloud.com>
16 months agoEX-9125 tests: ignore sanity-compr/1008 failures
Andreas Dilger [Tue, 13 Feb 2024 04:01:20 +0000 (21:01 -0700)]
EX-9125 tests: ignore sanity-compr/1008 failures

Temporarily ignore test failures for sanity-compr.sh test_1008 on
SLES15 and aarch64/ppc64le due to very high failure rates on those
systems.

Rename the *second* test_1008 to test_1080 so it can run, and allow
other new tests to start using disjoint numbers to avoid conflicts.

Test-Parameters: trivial testlist=sanity-compr.sh env=ONLY="1008 1080"
Fixes: bd462ce8e4 ("EX-7795 tests: add sanity-compr test for dir compression")
Fixes: 7546ae79e9 ("EX-8688 csdc: Add header checksum verification calculation")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I199613b1e5b1ee8ea7ca4287a6dfe090257ed72f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54019
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
16 months agoRM-620 build: New tag 2.14.0-ddn134
Andreas Dilger [Thu, 8 Feb 2024 09:00:30 +0000 (02:00 -0700)]
RM-620 build: New tag 2.14.0-ddn134

New tag 2.14.0-ddn134

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40aa3ac64522236007b800fda7edcd22255a7a4f

16 months agoRM-620 build: New tag lipe-2.41
Andreas Dilger [Thu, 8 Feb 2024 08:59:59 +0000 (01:59 -0700)]
RM-620 build: New tag lipe-2.41

New tag lipe-2.41

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8bcb7322f9be8c0c4bd6dd7a6f47826044e27801

16 months agoLU-17422 osc: Clear PageChecked on bounce pages
Patrick Farrell [Tue, 30 Jan 2024 21:07:59 +0000 (16:07 -0500)]
LU-17422 osc: Clear PageChecked on bounce pages

When we're finalizing a bounce page, we must clear
PageChecked.  Otherwise, if it's a page pool page, it will
be reused without the full wipe the kernel gives it, and we
will see PageChecked on pages which are not actually from
encryption and will handle them incorrectly.

Lustre-Change: https://review.whamcloud.com/53865/
Lustre-Commit: TBD  (from 5582abc557a8d7188bbb6fb2bc38585338f660b4)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8b319e7ba55dd883d74db79a19bf93b6f125616a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53866
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8257 lipe: Hot pools: Add compression support
Alexandre Ioffe [Thu, 18 Jan 2024 23:36:22 +0000 (15:36 -0800)]
EX-8257 lipe: Hot pools: Add compression support

Add compression type and compression level
to the lamigo command line options for the slow pool.
Check compression option availability in 'lfs mirror extend'
and use it when replicating a file to the slow pool.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I207a92079d98bfbffd3a08295527fbb7fca03045
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53753
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8688 csdc: Add header checksum verification and calculation
Artem Blagodarenko [Wed, 20 Dec 2023 22:53:26 +0000 (22:53 +0000)]
EX-8688 csdc: Add header checksum verification and calculation

This commit adds functionality to verify and calculate the checksum
of the header in the `ll_compr_hdr` struct, along with an additional
sanity check.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I24a9ab9cb7bea1208ada23aa6550127fe6a55017
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53521
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8814 csdc: Update async_args after resend
Artem Blagodarenko [Thu, 25 Jan 2024 23:01:05 +0000 (23:01 +0000)]
EX-8814 csdc: Update async_args after resend

It was decided to send an uncompressed request on redo.
osc_brw_prep_request() processes uncompressed data and prepares
a request, so some parts of the old request are outdated.

Let's update the old request with information from the new one.

Fixes: 8fb8d5b ("EX-8814 csdc: Revert "EX-8189 osc: do not compress resends")
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Idb1c6ee9db64cb1f2ea1c1562b1c5aae443263e3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-17000 utils: In mydaemon() check after calling open()
Arshad Hussain [Mon, 22 Jan 2024 10:33:02 +0000 (16:03 +0530)]
LU-17000 utils: In mydaemon() check after calling open()

This patch adds a check of the value returned by open() in
mydaemon() instead of using the value directly.

Lustre-change: https://review.whamcloud.com/53758
Lustre-commit: 0f67ab9b00c3949f257cd4e6081184858f245b4e

Test-Parameters: trivial kerberos=true testlist=sanity-krb5
CoverityID: 397666 ("Argument cannot be negative")
Fixes: d2d56f38da0 ("make HEAD from b_post_cmd3")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic59414977029221e8618c5bb3320e95d39d9cded
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53911
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoDDN-4656 osd-ldiskfs: hide alloc time in brw_stats
Andreas Dilger [Sun, 4 Feb 2024 00:44:13 +0000 (17:44 -0700)]
DDN-4656 osd-ldiskfs: hide alloc time in brw_stats

For EXA6.0/6.1 do not show the "block maps msec" stats in brw_stats
by default as this breaks collectd and lustrefs_collector parsing.
Base this check on the Linux kernel version, since those releases
were based on RHEL7.9 on the server, while EXA6.2/6.3 use RHEL8.

Add an "enable_brw_block_maps" parameter that can be used to
disable the display of this statistic (it is always collected).

Enable the "enable_stats_header" parameter automatically in the
same way, as this was added for EXA6.2 but should now be supported.

Test-Parameters: trivial
Fixes: c1e43cf8e0 ("LU-15564 osd: add allocation time histogram")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib5e33bd98085aaf4a5a5d39283d5d334b93ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53903
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
16 months agoLU-17476 tests: wait for sanity/350 to clean up
Andreas Dilger [Mon, 5 Feb 2024 22:27:20 +0000 (15:27 -0700)]
LU-17476 tests: wait for sanity/350 to clean up

Wait until sanity test_350 has finished deleting its files before
moving on to the next subtests, otherwise the background cleanup
can cause later test failures (in particular test_413a).

Test-Parameters: trivial testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Fixes: d1509ff2ca ("LU-17476 lnet: prefer to use bits only to match ME")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9ff61013764f4e47916999eefab893e069bb217a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoEX-7717 lipe: Add simple compression ratio statistics
Vitaliy Kuznetsov [Tue, 12 Dec 2023 14:49:27 +0000 (15:49 +0100)]
EX-7717 lipe: Add simple compression ratio statistics

This patch adds a new table to display data
compression ratio in overall statistics.

The new table to display compression ratio (for regular files)
will have the following column values:
0. Compression ratio range;
1. Count of files in range;
2. Number of files in range as a percent of total
   number of files;
3. Number of files in this range or smaller as
   a % of total # of files;
4. Total compression size of files in range;
5. Total compression size of files in range as a % of
   total compression size of files;
6. Total compression size of files in this range or
   smaller as a % of total compression size of files;
7. Minimum value in range (ratio);
8. Maximum value in range (ratio).

The columns in the table are numbered from 0 to 8 for a better
understanding of the table without the need to name the
columns with long text.
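
The column layout above can be sketched as follows. This is an
illustration in Python, not the lipe implementation: ratio_table(), the
(ratio, compressed_size) input tuples and the bucket edges are all
assumptions made for the example, and "compression size" is taken to
mean the compressed size of each file.

```python
def pct(part, total):
    """Percentage of total, or 0.0 when the total is zero."""
    return 100.0 * part / total if total else 0.0


def ratio_table(files, edges):
    """Build rows with columns 0..8 as listed above.

    files: list of (ratio, compressed_size) tuples (hypothetical input)
    edges: ascending upper bounds of the ratio ranges (hypothetical)
    """
    total_n = len(files)
    total_sz = sum(sz for _, sz in files)
    rows, cum_n, cum_sz, lo = [], 0, 0, 0.0
    for hi in edges:
        bucket = [(r, sz) for r, sz in files if lo <= r < hi]
        n = len(bucket)
        sz = sum(s for _, s in bucket)
        cum_n += n
        cum_sz += sz
        ratios = [r for r, _ in bucket]
        rows.append((
            (lo, hi),                         # 0. ratio range
            n,                                # 1. count of files
            pct(n, total_n),                  # 2. % of all files
            pct(cum_n, total_n),              # 3. cumulative % of files
            sz,                               # 4. total size in range
            pct(sz, total_sz),                # 5. % of total size
            pct(cum_sz, total_sz),            # 6. cumulative % of size
            min(ratios) if ratios else None,  # 7. minimum ratio in range
            max(ratios) if ratios else None,  # 8. maximum ratio in range
        ))
        lo = hi
    return rows
```

For example, ratio_table([(1.2, 100), (1.8, 300), (2.5, 600)], [2.0, 4.0])
yields two rows, one per range, with the cumulative columns reaching 100
in the last row.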

This PR also changes some variable types to the "double" type
for correct calculation of values and to avoid duplication of
variables with the same semantic value.

The output of information in reports with the .out
extension has also been improved.

Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I242ddb9c4132a7fce81508dadacf8e2b01e3cead
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52372
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8038 csdc: store compression info in FID EA
Bobi Jam [Thu, 11 Jan 2024 09:44:18 +0000 (17:44 +0800)]
EX-8038 csdc: store compression info in FID EA

Store compression information in the OST-object's FID EA, so that lfsck
can use it to recover the MDT-object layout EA from orphan OST-object(s).

Lustre 2.15 may embed PFID and layout stripe info in the LMA EA. This
patch clears them from the LMA EA and stores them, together with the
compression info, directly in the FID EA thereafter.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Iacac04601b73f85d9bc057b8dd34a5004248dac4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8038 csdc: expand filter_fid
Bobi Jam [Fri, 4 Aug 2023 07:02:41 +0000 (15:02 +0800)]
EX-8038 csdc: expand filter_fid

Expand filter_fid to include compression information. For
compatibility reasons, if the file is uncompressed, still
store the old filter_fid with no compression info in the FID EA.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I388500c03604749d05849aeed3c9141974540e4a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53663
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoRM-620 build: New tag 2.14.0-ddn133
Andreas Dilger [Fri, 2 Feb 2024 16:21:29 +0000 (09:21 -0700)]
RM-620 build: New tag 2.14.0-ddn133

New tag 2.14.0-ddn133

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I06785ba295c668f8aab7dcbf2504c64068592123

16 months agoRM-620 build: New tag lipe-2.40
Andreas Dilger [Fri, 2 Feb 2024 16:21:16 +0000 (09:21 -0700)]
RM-620 build: New tag lipe-2.40

New tag lipe-2.40

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If7cd3a8cb392eee47bbc004ba881518f0e3fb991

16 months agoLU-17482 llite: short read could mess up next read offset
Bobi Jam [Fri, 26 Jan 2024 10:14:36 +0000 (18:14 +0800)]
LU-17482 llite: short read could mess up next read offset

When a read reaches EOF, it could read data from stale pagecache, but
we need to restore iocb->ki_pos so that the next read can continue
from the correct offset.

Lustre-change: https://review.whamcloud.com/53827
Lustre-commit: TBD (from 4bec3a277c83932cfb5ba26e31336e1f4666460a)

Fixes: 4468f6c9d9 ("LU-16025 llite: adjust read count as file got truncated")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib8b62c41bf65f8efec82dda53fcfbdb68ad08b38
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-17476 lnet: prefer to use bits only to match ME
Serguei Smirnov [Sat, 27 Jan 2024 20:17:34 +0000 (12:17 -0800)]
LU-17476 lnet: prefer to use bits only to match ME

In some cases, it has been observed that a reply will arrive
at the portal with the correct match bits, but is dropped by
lnet_parse_put().  This appears to happen with LNet Multi-Rail
peers, each having two separate NIDs.

If a reply arrives with matchbits available and matching, but
the NIDs don't match, confirm the match if the NIDs are found
to belong to the same peer.  This will only happen in cases
where the reply would be dropped entirely, causing hundreds of
seconds of delay until the RPC is resent, so the extra overhead
of checking for a peer match before dropping the request is
only in the error path and minimal compared to the alternative.
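
The fallback can be sketched as below. This is a simplified Python
illustration, not the LNet code: me_matches() and the peers argument (a
list of NID sets, one set per peer) are hypothetical, and real ME
matching involves further details that are left out here.

```python
def me_matches(me_bits, me_nid, msg_bits, msg_nid, peers):
    """Decide whether an incoming message matches a match entry.

    peers: iterable of sets, each holding the NIDs of one peer
           (hypothetical representation of Multi-Rail peers).
    """
    if me_bits != msg_bits:
        return False  # match bits must always agree
    if me_nid == msg_nid:
        return True   # normal path: bits and NID both match
    # Error path only: the message would otherwise be dropped, so check
    # whether both NIDs belong to the same peer before giving up.
    return any(me_nid in nids and msg_nid in nids for nids in peers)
```

The peer lookup runs only after the direct NID comparison has failed, so
the normal path pays nothing for it.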

Add CFS_FAIL_CHECK() for exercising the match NIDs code.

That is in a hot codepath, but CFS_FAIL_CHECK() is marked unlikely()
and this check is in the error case and _should_ only be hit when the
message would have been dropped anyway, so it seems unlikely to impact
performance in any meaningful way.

Lustre-change: https://review.whamcloud.com/53843
Lustre-commit: TBD (from 3360e892750d1bf4f2b7ceab60d9a637b3e649ad)

Test-Parameters: testlist=sanity-lnet env=ONLY=350,ONLY_REPEAT=10
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I10e1a2142539ddf5dabc26ce962cec1f2cfcf3db
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53846
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
16 months agoLU-16873 osd: update OI_Scrub file with new magic
Alexander Zarochentsev [Sun, 28 May 2023 12:42:27 +0000 (08:42 -0400)]
LU-16873 osd: update OI_Scrub file with new magic

The fix for LUS-11542 detects the format change correctly
but does not write the new OI scrub file magic, so each new mount
triggers the "oi files counter reset" again and again.

Lustre-change: https://review.whamcloud.com/51226
Lustre-commit: 38b7c408212f60d684c9b114d90b4514e0044ffe

Fixes: 126275ba83 ("LU-16655 scrub: upgrade scrub_file from 2.12 format")
HPE-bug-id: LUS-11646
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ia13fcfaf0d8f2c4ee9331dd9fec0ff159d195186
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53854
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
16 months agoEX-8598 tests: use alternative data source for rewriting
Artem Blagodarenko [Sun, 28 Jan 2024 20:24:31 +0000 (20:24 +0000)]
EX-8598 tests: use alternative data source for rewriting

Using the same file as input has a disadvantage: it is not
possible to tell whether the data was rewritten at all.
An alternative data source should be used.

Let's shift the source file data and use it as the source.
To check the rewriting result, the same operation is performed
on a copy of the destination file stored outside the Lustre FS.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1004
Change-Id: I6ef400520359bfe9156c3f47e757064863bdf4e0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53088
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoEX-8996 ofd: handle 'missing object' reads
Patrick Farrell [Wed, 24 Jan 2024 16:02:32 +0000 (11:02 -0500)]
EX-8996 ofd: handle 'missing object' reads

When the read code (eg, mdt_preprw_read) finds there is no
object, it will return a read with 0 pages, but not fail the
read.  The assert for local and remote pages needs to
recognize this case.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idc6ff70f71abc100f750a63eca73a754a56f6435
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-8450 tests: skip sanity-lipe-find3/306 on el7.9
Andreas Dilger [Mon, 29 Jan 2024 23:20:02 +0000 (16:20 -0700)]
EX-8450 tests: skip sanity-lipe-find3/306 on el7.9

The sanity-lipe-scan3 test_309 is failing consistently with el7.9
*clients*.  Exclude it until fixed or we drop this client version.

Test-Parameters: trivial testlist=sanity-lipe-find3 clientdistro=el7.9 serverdistro=el7.9
Test-Parameters: trivial testlist=sanity-lipe-find3 clientdistro=el8.8 serverdistro=el7.9
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iceca83a3b85df95fe45482076170d77a6abc0947
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53853
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
16 months agoEX-7601 tests: skip sanity-compr tests in interop
Andreas Dilger [Mon, 29 Jan 2024 21:54:03 +0000 (14:54 -0700)]
EX-7601 tests: skip sanity-compr tests in interop

Skip a number of subtests in sanity-compr that depend on fixes
landed to the code that were not available in older versions.

Test-Parameters: trivial testlist=sanity-compr serverversion=EXA6.3.0
Fixes: 3e1dd9d6ae ("LU-17468 lod: component add missed pattern info")
Fixes: 7731c7fc74 ("EX-7601 tests: unaligned read tests")
Fixes: 033dd0ba2c ("EX-7644 mmap: add mmap support for compression")
Fixes: 46708e4636 ("EX-7601 tests: tests for read-modify-write")
Fixes: 6c4c4d7599 ("EX-7601 tests: add multi-mount compression test")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I26cae5cf01cc32c9f3e4386cf7151a66ac3678ea
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53852
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoEX-7795 tests: add sanity-compr test for dir compression
Jian Yu [Tue, 30 Jan 2024 02:26:24 +0000 (18:26 -0800)]
EX-7795 tests: add sanity-compr test for dir compression

This patch adds a sanity-compr test to validate that
we get directory space usage reduction with compression.

Change-Id: I16f3a3f1e413e4884b3973829df36500667271ce
Test-Parameters: trivial testlist=sanity-compr env=ONLY="1007 1008",ONLY_REPEAT=3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-16367 utils: clean up ldiskfs feature handling
Andreas Dilger [Mon, 5 Dec 2022 18:59:02 +0000 (11:59 -0700)]
LU-16367 utils: clean up ldiskfs feature handling

Update the default ldiskfs features used by mkfs.lustre:
- enable large_dir on OSTs as well as MDTs
- remove obsolete handling of "ext3" filesystems
- clean up handling of other features that have become a bit messy

Lustre-change: https://review.whamcloud.com/49316
Lustre-commit: e6b6b7ee253cedd8aeb6bb48d6c54916368c4109

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id717c3ba939ccf9b2de34e868d4415e88429ef39
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53875
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoLU-16599 obdclass: job_stats can parse escaped jobid string
Lei Feng [Wed, 1 Mar 2023 00:16:03 +0000 (08:16 +0800)]
LU-16599 obdclass: job_stats can parse escaped jobid string

Writing a jobid to job_stats proc entry asks lustre to clear
the stats of the specific jobid. Since job_stats outputs
escaped jobid string in some cases, it should be able to parse
an escaped jobid string when the string is written to it.

Lustre-change: https://review.whamcloud.com/50160
Lustre-commit: 8f004bc53b1a488dad5a92a580f5f0c078e33654

Test-Parameters: trivial
Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: Idbc63dac6c3b35331317927107e634a3d638dd66
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
16 months agoLU-14810 lnet: Cancel discovery ping/push on shutdown
Chris Horn [Tue, 5 Dec 2023 09:56:57 +0000 (03:56 -0600)]
LU-14810 lnet: Cancel discovery ping/push on shutdown

Discovery shutdown can race with ping and push events. In some cases
this can result in failing to unlink ping/push MDs on shutdown.
Protect against this by checking for PING/PUSH_FAILED state on peers
on the request queue.

Lustre-change: https://review.whamcloud.com/53356
Lustre-commit: c3b9597742d5118a96f56129e7dd30d84468d2c8

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=500,ONLY_REPEAT=50
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I84a1f5beb6508651bc62e1dd93271f9e72f5081c
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53848
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-17471 osd: add symlink for brw_stats
Hongchao Zhang [Fri, 26 Jan 2024 13:43:36 +0000 (21:43 +0800)]
LU-17471 osd: add symlink for brw_stats

Add a symlink at /proc/fs/lustre/osd-*/*/brw_stats to
/sys/kernel/debug/lustre/osd-*/*/brw_stats to fix
the compatibility issue with older utilities that are
still using the old proc entry.

Lustre-change: https://review.whamcloud.com/53829
Lustre-commit: TBD (from 5fad20603098c55c0080548a177023a36e640e84)

Fixes: 8a84c7f9c7 ("LU-14927 osd: share brw_stats code between OSD back ends")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ie86b2b384e3b91f98ead00b6325ddeb020e47aa5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53858
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoRM-620 build: New tag 2.14.0-ddn132
Andreas Dilger [Mon, 29 Jan 2024 09:02:19 +0000 (02:02 -0700)]
RM-620 build: New tag 2.14.0-ddn132

New tag 2.14.0-ddn132

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I65b4833ced4c7c398110c49336138d5fb9947a31

16 months agoLU-17464 lod: set llc_ostlist to NULL after free
Bobi Jam [Wed, 24 Jan 2024 06:04:35 +0000 (14:04 +0800)]
LU-17464 lod: set llc_ostlist to NULL after free

Default LOV striping could free a component entry's llc_ostlist if
needed, e.g. when expanding component entries. Without setting it to
NULL, it could be double allocated/freed later.

Lustre-change: https://review.whamcloud.com/53797
Lustre-commit: TBD (from 5e7440b488050166af15e744dc74b9dc4f0d3b96)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I25824cb61dd47ba284403039259593b88d25fa9d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53798
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-9007 lipe: Fix getting client mount path
Vitaliy Kuznetsov [Tue, 23 Jan 2024 12:34:40 +0000 (13:34 +0100)]
EX-9007 lipe: Fix getting client mount path

This patch fixes an issue where size reports do not work
when the client is not mounted.

Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I1e99fddf21960ecd14526c0d6baeb75c2a138dd8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53763
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoEX-9029 lfs: not iterate compr_type_table using ARRAY_SIZE
Bobi Jam [Wed, 24 Jan 2024 15:35:37 +0000 (23:35 +0800)]
EX-9029 lfs: not iterate compr_type_table using ARRAY_SIZE

The EX-8311 patch modified compr_type_table to contain NULL fields in
the array, so iterating over the array should not use ARRAY_SIZE, but
should skip those elements with a NULL compression type name.
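
The iteration rule can be illustrated with a small Python sketch. The
table contents below are placeholders invented for the example and are
not the real compr_type_table entries; None plays the role of a NULL
compression type name.

```python
# Hypothetical table: (name, type_id) pairs with unnamed gaps in between.
EXAMPLE_TABLE = [
    ("type_a", 0),
    (None, 1),      # gap: entry has no compression type name
    ("type_b", 2),
    (None, 3),      # gap
    ("type_c", 4),
]


def named_entries(table):
    """Yield only the entries that carry a compression type name."""
    for name, type_id in table:
        if name is None:
            continue  # skip gaps rather than trusting the array length
        yield name, type_id
```

Here list(named_entries(EXAMPLE_TABLE)) keeps the three named entries
and drops the two gaps, regardless of where the gaps sit in the table.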

Fixes: ec5814c9a7 ("EX-8311 csdc: allow specify 'fast'/'best' compression type")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I8e4988fd3a63c1cb66f75510d190c2ebc4f8f9be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53808
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-17468 lod: component add missed pattern info
Bobi Jam [Wed, 24 Jan 2024 17:08:33 +0000 (01:08 +0800)]
LU-17468 lod: component add missed pattern info

"lfs setstripe --component-add" missed setting the component pattern,
which caused some settings to be lost, such as overstriping and compression.

Lustre-change: https://review.whamcloud.com/53817
Lustre-commit: TBD (from 3849e3efdc58d535ee6858aafa22cfdc665ba2d7)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7ad746a550f1afea54a6f5b68823a79a85a44082
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53811
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-16307 tests: fix sanity-sec test_31
Sebastien Buisson [Tue, 23 Jan 2024 13:29:11 +0000 (14:29 +0100)]
LU-16307 tests: fix sanity-sec test_31

In order to improve sanity-sec test_31 resiliency, reorganize the way
the new LNet '999' is handled. And make sure everything is correctly
cleaned up after the test.

Lustre-change: https://review.whamcloud.com/53818
Lustre-commit: TBD (from f4a96799159fd662855542d471197ac4060d3295)

Test-Parameters: trivial testgroup=review-dne-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd657c7555e598d0ebc08387eac537b1c73e35bd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53779
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>