Whamcloud - gitweb
fs/lustre-release.git
7 months ago LU-10465 lov: increase default stripe size to 4MB 18/37318/13
Andreas Dilger [Thu, 23 Jan 2020 20:15:10 +0000 (20:15 +0000)]
LU-10465 lov: increase default stripe size to 4MB

Increase the default stripe size from 1MB to 4MB for better
performance and reduced LDLM lock contention for larger writes.

This can also reduce the need to cache data on the client on a
striped file before a full RPC is generated, since the default
RPC size is 4MB, but with 1MB stripe size, the file would need
4x full stripe_count * stripe_size writes before an RPC is full.
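
To make the arithmetic above concrete, here is a minimal Python sketch (not Lustre code). It assumes a plain RAID-0 layout, sequential writes starting at offset 0, and a 4MB RPC size; the function names are hypothetical. It computes how many bytes must be written before the first object has accumulated one full RPC.

```python
MiB = 1 << 20


def object_index(offset, stripe_size, stripe_count):
    """Index of the stripe (object) that holds a file offset in a RAID-0 layout."""
    return (offset // stripe_size) % stripe_count


def bytes_until_first_full_rpc(stripe_size, stripe_count, rpc_size=4 * MiB):
    """Sequential bytes written from offset 0 before stripe 0 holds rpc_size bytes."""
    full_chunks, remainder = divmod(rpc_size, stripe_size)
    if remainder == 0:
        # The last chunk needed is itself a whole stripe_size chunk.
        full_chunks -= 1
        remainder = stripe_size
    # Each earlier chunk for stripe 0 costs one full pass over all stripes.
    return full_chunks * stripe_count * stripe_size + remainder


if __name__ == "__main__":
    for size in (1 * MiB, 4 * MiB):
        needed = bytes_until_first_full_rpc(size, stripe_count=4)
        print(f"stripe_size={size // MiB}MiB -> {needed // MiB}MiB written")
```

Under these assumptions, with 4 stripes the first full RPC needs 13MB of sequential data at a 1MB stripe size, but only 4MB at a 4MB stripe size.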

Patch includes several test fixes:
- sanity-pfl: takes into account stripe size in some tests
- sanity-flr: use bigger component size and amount of data to
  saturate all stripes as expected by test
- sanity: 130g to use 1M stripe prior to FIEMAP calcs
- sanity-lfsck: 36[a-c] to use 1M stripe as expected by calcs

Test-Parameters: testlist=sanity-compr
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3cef8805247fc5253e0a0ac05157b9d609054df9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/37318
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17041 kernel: update RHEL 8.8 [4.18.0-477.21.1.el8_8] 03/52003/2
Jian Yu [Fri, 18 Aug 2023 21:28:43 +0000 (14:28 -0700)]
LU-17041 kernel: update RHEL 8.8 [4.18.0-477.21.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.21.1.el8_8.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: Ie24c8e438dd33afafb900664d9a4010160bc1a45
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52003
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 months ago LU-17096 debian: add obd_test.ko, llog_test.ko to lustre-tests 98/52398/8
Timothy Day [Sun, 17 Sep 2023 01:00:35 +0000 (21:00 -0400)]
LU-17096 debian: add obd_test.ko, llog_test.ko to lustre-tests

The obd_test.ko module was missing from the lustre-tests
Debian package. Hence, it wasn't being installed on the
Ubuntu clients during testing. This caused sanity/55a and
sanity/55b to consistently fail.

Add llog_test.ko to lustre-tests also. It's not unheard of to
use Ubuntu for Lustre server. So the package may as well include
llog_test.ko.

Also, update debian/.gitignore.

Test-Parameters: trivial testlist=sanity env=ONLY=55,ONLY_REPEAT=50 clientdistro=ubuntu2204
Test-Parameters: trivial testlist=sanity env=ONLY=55,ONLY_REPEAT=50
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I050de4563478996828886ca623fa96b58f9fef5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52398
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Thomas Stibor <thomas@stibor.net>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 months ago LU-16661 build: remove -dev packages for Debian 81/52281/4
Andreas Dilger [Tue, 5 Sep 2023 20:18:31 +0000 (14:18 -0600)]
LU-16661 build: remove -dev packages for Debian

Don't depend on libmount-dev, libsnmp-dev, libkeyutils-dev for the
lustre-client-utils and lustre-server-utils packages.  These are
only needed for build and for the lustre-client-dkms package.

Disable SNMP by default as this is no longer used anywhere.

Test-Parameters: trivial testlist=runtests clientdistro=ubuntu2204
Fixes: 7dc6e1128a ("LU-15888 build: Debian dkms-debs requires ed and libkeyutils")
Fixes: af2f77633b ("LU-13818 build: use libsnmp-dev instead of libsnmp30")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib788a97028ee40a9c61070d00b823620ec3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52281
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
8 months ago LU-16661 build: use "Recommends: perl" for lustre-iokit 25/52225/3
Jian Yu [Mon, 4 Sep 2023 07:04:50 +0000 (00:04 -0700)]
LU-16661 build: use "Recommends: perl" for lustre-iokit

In lustre-iokit, the "plot" commands all use perl, but
the actual "*-survey" scripts are written in bash, so
the "Requires: perl" in lustre.spec.in for lustre-iokit
could be downgraded to "Recommends: perl" for RHEL 8+
(RHEL 7 does not handle "Recommends:").

Test-Parameters: trivial testlist=obdfilter-survey

Change-Id: I55f3c57e73ac91cedce745dc4f424c3542978cd4
Fixes: 800a9ec58f78 ("LU-16661 build: improve lustre.spec.in Requires")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52225
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17000 utils: Fix Resource leak under mount_utils.c 18/52218/3
Arshad Hussain [Fri, 1 Sep 2023 07:12:18 +0000 (12:42 +0530)]
LU-17000 utils: Fix Resource leak under mount_utils.c

This patch fixes a resource leak reported
by a Coverity run.

CoverityID: 399700 ("Resource leak"): mount_utils.c

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ib3281d922936822a0ac298a15d6e8863b3c2c9b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17000 ptlrpc: fix string overflow warnings 10/52210/4
Andreas Dilger [Thu, 31 Aug 2023 20:50:56 +0000 (14:50 -0600)]
LU-17000 ptlrpc: fix string overflow warnings

Fix potential string overflow warnings in sptlrpc_flavor2name()
calling strncat() with the full size of the target buffer
instead of the *remaining* space in the target buffer.

Fix potential string overflow warning in sepol_seq_write_old()
and sepol_seq_write() potentially copying an unterminated string
from userspace via strncpy() and not terminating it afterward.

Since the maximum incoming parameter size is known in advance,
is reasonably small (~342 bytes), and is only used temporarily,
reorganize the code to avoid two buffer allocations and copies.
Use memcpy() to copy the string since its length is known, and
always add a NUL terminator to the string afterward.

Improvements to error messages and code style in these functions.

Addresses-Coverity: 199034 ("Out-of-bounds access")
Addresses-Coverity: 199063 ("Out-of-bounds access")
Addresses-Coverity: 199108 ("Out-of-bounds access")
Addresses-Coverity: 397374 ("String not null terminated")
Addresses-Coverity: 397394 ("String not null terminated")

Test-Parameters: trivial testlist=sanity-sec,sanity-selinux
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia810ce9f07b663a90049bb78af21c06f0e3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52210
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16796 obd: Change struct obd_device to use kref 79/52179/2
Arshad Hussain [Wed, 30 Aug 2023 09:39:58 +0000 (15:09 +0530)]
LU-16796 obd: Change struct obd_device to use kref

This patch changes struct obd_device to use
kref(refcount_t) instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ia8539abb11357b41edd4cf532896d3bc1e66e92f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16796 mdt: Change struct cdt_agent_record_loc to use kref 53/52153/3
Arshad Hussain [Tue, 29 Aug 2023 07:41:51 +0000 (13:11 +0530)]
LU-16796 mdt: Change struct cdt_agent_record_loc to use kref

This patch changes struct cdt_agent_record_loc to use
kref(refcount_t) instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I99141b00b4cfc7b4b46a87462b9ce21735bb0e7d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52153
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17015 obdclass: make upcall cache hashtable size dynamic 28/52128/3
Sebastien Buisson [Mon, 28 Aug 2023 09:37:51 +0000 (11:37 +0200)]
LU-17015 obdclass: make upcall cache hashtable size dynamic

The hash table used by the upcall cache mechanism should have an
adjustable size, depending on the purpose and context where it is
used.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I53c5cb14f9a5630fc269d97cead9a5ca6a33895e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52128
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16743 lod: create stripe with correct attr 52/52052/6
Lai Siyao [Mon, 21 Aug 2023 22:47:33 +0000 (18:47 -0400)]
LU-16743 lod: create stripe with correct attr

lod_xattr_set_lmv() creates a directory stripe with the master object attr,
but it shouldn't change attr->la_valid, otherwise bogus data may be
set on the stripe object.

ZFS osd_create() copies attr to the object directly, so clear la_flags if
LA_FLAGS is not set in la_valid.

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8385f36bd2eee0e55cbe6bd031b0e013cda40e06
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52052
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16838 tests: use import name in 398a 64/51064/3
Patrick Farrell [Fri, 19 May 2023 16:24:25 +0000 (12:24 -0400)]
LU-16838 tests: use import name in 398a

The LU-15670 test change assumes ost1_import is always
OST0000.  This isn't quite always true, so the test is
failing in certain configurations.

Change it to use the import name.

Fixes: 649d638467 ("LU-15670 clio: Disable lockless for DIO with O_APPEND")
Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ifaefc503d1118ecd6fd45b661cbe94607f7ad799
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51064
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
8 months ago LU-16834 obdfilter: Do not attach device if already present 34/51034/5
Arshad Hussain [Wed, 17 May 2023 09:17:24 +0000 (05:17 -0400)]
LU-16834 obdfilter: Do not attach device if already present

Running obdfilter-survey with "case=disk" and targets that
repeat the same OST names makes obdfilter-survey fail with
"error: attach: File exists". This is because the attach and
setup are already done on the first iteration, so subsequent
attaches fail as the device/uuid is already present.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8ab9ea905ec86b9e1aa8906bebcc38fee0fdbc23
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16521 tests: allow + separator for racer 66/49866/10
Elena Gryaznova [Wed, 31 May 2023 19:42:57 +0000 (22:42 +0300)]
LU-16521 tests: allow + separator for racer

The Test-Parameters: line parses ',' even with quotes, so it cannot
be used as a separator in Autotest for RACER_PROGS and RACER_EXTRA.

Allow both ',' and '+' as a separator for both RACER_PROGS and
RACER_EXTRA tasks so specific racer tasks can be run.
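
The separator handling described above can be sketched in a few lines. This is an illustrative Python sketch, not the racer shell code; the helper names are hypothetical, and since the commit does not say what the number after ':' means, the sketch simply carries it along as an integer.

```python
import re


def split_racer_list(value):
    """Split a RACER_PROGS/RACER_EXTRA style value on either ',' or '+'."""
    return [item for item in re.split(r"[,+]", value) if item]


def parse_racer_extra(value):
    """Map 'name:number' items to {name: number}; number is None if absent."""
    tasks = {}
    for item in split_racer_list(value):
        name, _, number = item.partition(":")
        tasks[name] = int(number) if number else None
    return tasks


if __name__ == "__main__":
    print(split_racer_list("file_rename+file_truncate"))
    print(parse_racer_extra("file_create:5+dir_create:5+dir_remote:5"))
```

Accepting both characters keeps existing comma-separated settings working while letting a Test-Parameters line use '+' instead.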

Do not always enable dir_remote and dir_migrate if RACER_PROGS set.

Test-Parameters: trivial testlist=racer env=RACER_PROGS=file_rename+file_truncate,RACER_EXTRA=file_create:5+dir_create:5+dir_remote:5
Fixes: 6d9e74580e ("LU-14274 tests: enhance racer to set extra layout")
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-11466
Change-Id: I3f3b4da6f76ccfac2680068184dc4714187a9a4d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49866
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months ago LU-15233 llite: Remove extra cl_page_delete call 83/45583/4
Patrick Farrell [Mon, 15 Nov 2021 22:14:07 +0000 (17:14 -0500)]
LU-15233 llite: Remove extra cl_page_delete call

"LU-5108 osc: Performance tune for LRU" added a call to
cl_page_delete to the page discard code used by the OSC
lru shrinker.

This seems to have been a mistake.  cl_page_discard causes
page invalidation, which calls ll_invalidatepage, which
calls cl_page_delete if the page can be found.

Since the page is locked here and ll_invalidatepage checks
for the cl_page, this extra call to cl_page_delete has
probably never caused an issue.

But it's extraneous and kind of weird, and misled me a bit
when working on another bug.  Let's remove it.

Fixes: b117bc837c02 ("LU-5108 osc: Performance tune for LRU")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1380f532359ba949a0bbb8b53227a6c8e6491030
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45583
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16943 tests: use primary ost1 server in replay-single/135 58/52058/3
Jian Yu [Thu, 24 Aug 2023 00:56:05 +0000 (17:56 -0700)]
LU-16943 tests: use primary ost1 server in replay-single/135

This patch fixes replay-single test_135() to make sure
the primary ost1 server is used at the beginning of the test.

Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200 testlist=replay-single

Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200,FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=replay-single,mmp

Fixes: 1b73b6465b77 ("LU-16943 tests: fix replay-single/135 under hard failure mode")
Change-Id: Ia25314255c9f00ba71687e1f757517f37031caed
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52058
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-12518 llite: rename count and nob variables to bytes 54/38154/14
Andreas Dilger [Thu, 31 Aug 2023 16:26:35 +0000 (12:26 -0400)]
LU-12518 llite: rename count and nob variables to bytes

Rename "*count", "*nob", and "cnt" and similar variables to use
"*bytes" to make it clear what the units are vs. number of pages.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I195f2db4182e4b3099b3f4aa2e25b91f9f3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/38154
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months ago LU-8585 llite: add special fid handling for fhandle API 07/51707/11
James Simmons [Thu, 10 Aug 2023 01:59:28 +0000 (21:59 -0400)]
LU-8585 llite: add special fid handling for fhandle API

Lustre has been moving its FID handling to the fhandle API. This
works well for normal files but Lustre has special FIDs that don't
map to normal files which are used by user land applications. Add
special handling to ll_iget_for_nfs() so the fhandle API can work
with these special FIDs. These FIDs should also work with filesets.

Change-Id: I4b55d96cc9eea0b1fb898f94c071c8b30c7b2bd5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-6142 misc: update headers in config, debian, rpm 06/52106/3
Timothy Day [Sun, 27 Aug 2023 00:19:52 +0000 (00:19 +0000)]
LU-6142 misc: update headers in config, debian, rpm

Update the file header to have the SPDX license and
use the standard format.

Fix minor style issues with comments in a few files.
Remove `dnl` from m4 files.

Files that are uncertain are left as NOASSERTION
for the license identifier. This makes no claim
about the file. It is used to track files so they
can be addressed later.

https://spdx.github.io/spdx-spec/v2-draft/package-information/#75-package-supplier-field

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I212ce05a4292bbb0d71372d9d75880ce45a219f3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52106
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16847 ldiskfs: reduce memory usage by OST IO threads 92/51392/6
Alexey Lyashkov [Wed, 21 Jun 2023 09:10:27 +0000 (12:10 +0300)]
LU-16847 ldiskfs: reduce memory usage by OST IO threads

The page array is redundant now that the lnb array has been added, since
pages can be addressed via lnb->lnb_page. Remove it to reduce memory consumption.

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ieb0c186e27f56c770fd2ebfbddce9ccf19791611
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51392
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16847 ldiskfs: refactor brw_stats code 91/51391/9
Alexey Lyashkov [Tue, 20 Jun 2023 14:01:03 +0000 (17:01 +0300)]
LU-16847 ldiskfs: refactor brw_stats code

Counting the number of disk or logical extents doesn't
need a loop.
All the information is available around ldiskfs_map_blocks.

HPE-bug-id: LUS-11645
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I77f3707b88e9bdf6ea06acc950af2a41f056f5d0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-930 doc: add static analysis documentation 86/52186/2
Timothy Day [Wed, 30 Aug 2023 17:59:20 +0000 (17:59 +0000)]
LU-930 doc: add static analysis documentation

Add more documentation about Clang/LLVM and other
static analysis tools for Lustre. This will make it
easier for other developers to try out various tools.
It will also serve as a place to record best practices
and experiences. Hopefully, this will increase awareness
and usage of these various tools and improve the Lustre
codebase as a result.

This patch also has a few other small doc updates.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I4bd860775729aaa4ef1ae1cc2cceb6435f3affdd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16796 mdt: Change struct cdt_agent_req to use kref 48/52148/3
Arshad Hussain [Tue, 29 Aug 2023 07:00:35 +0000 (12:30 +0530)]
LU-16796 mdt: Change struct cdt_agent_req to use kref

This patch changes struct cdt_agent_req to use
kref(refcount_t) instead of atomic_t

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I0a99002504ff453b8b748391f08bd1020c545321
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52148
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17058 build: add help and checkpatch as make targets 42/52142/3
Timothy Day [Mon, 28 Aug 2023 19:52:04 +0000 (19:52 +0000)]
LU-17058 build: add help and checkpatch as make targets

Add `make help` to print out available make targets. The
output is styled after the Linux kernel `make help`.
Add `make checkpatch` to run checkpatch.pl script
against most recent commit.

Update README to mention `make help`.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I65ce84040502994ae7caa0c8b72d808442f6b79e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52142
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-13031 tests: skip sanity/test_205h,205i in interop 95/52095/2
Thomas Bertschinger [Fri, 25 Aug 2023 16:28:16 +0000 (12:28 -0400)]
LU-13031 tests: skip sanity/test_205h,205i in interop

Skip sanity tests 205h and 205i when the MDS version is too old
to have the jobid xattr changes. Fix test 103a to not try to set
the job_xattr parameter when it does not exist.
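
A version gate of this kind comes down to comparing dotted versions numerically rather than as strings. The following is a minimal Python sketch, not the Lustre test-framework code; the helper names are hypothetical, and the version strings are placeholders rather than the actual release that introduced the jobid xattr changes. It assumes both versions have the same number of components.

```python
def version_tuple(version):
    """Turn a dotted version such as '2.15.54' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def server_supports(server_version, minimum_version):
    """True if server_version is at least minimum_version (numeric compare)."""
    return version_tuple(server_version) >= version_tuple(minimum_version)


if __name__ == "__main__":
    # Placeholder versions, chosen only to show the comparison.
    minimum = "2.15.54"
    for server in ("2.15.9", "2.15.54", "2.16.0"):
        action = "run" if server_supports(server, minimum) else "skip"
        print(f"MDS {server}: {action}")
```

Comparing the raw strings would get this wrong: "2.15.9" sorts after "2.15.54" as text even though it is the older version.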

Fixes: 23a2db28dcf1 ("LU-13031 jobstats: store jobid in xattr when files are created")
Test-Parameters: trivial testlist=sanity env=ONLY="103a 205h 205i"
Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Iaa5d0c1a7f3fa6769fab4340ade315e7a49df009
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52095
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months ago LU-16831 lod: replace (__u16)-1 with LOV_ALL_STRIPES 55/52055/3
Jian Yu [Wed, 23 Aug 2023 19:05:22 +0000 (12:05 -0700)]
LU-16831 lod: replace (__u16)-1 with LOV_ALL_STRIPES

This patch replaces "(__u16)-1" with constant LOV_ALL_STRIPES
and replaces "(__u64)-1" with OBD_OBJECT_EOF.
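
For readers unfamiliar with the idiom, casting -1 to an unsigned type yields the all-ones value for that width, which is why the casts work as sentinels. The Python sketch below only illustrates that arithmetic. The constant names mirror the commit, but the numeric values are simply what the casts evaluate to and are written here as an assumption, not copied from the Lustre headers.

```python
import ctypes

# Assumed values: what "(__u16)-1" and "(__u64)-1" evaluate to.
LOV_ALL_STRIPES = 0xFFFF
OBD_OBJECT_EOF = 0xFFFFFFFFFFFFFFFF


def as_u16(value):
    """Reinterpret an integer as an unsigned 16-bit value, like a C cast."""
    return ctypes.c_uint16(value).value


def as_u64(value):
    """Reinterpret an integer as an unsigned 64-bit value, like a C cast."""
    return ctypes.c_uint64(value).value


if __name__ == "__main__":
    print(hex(as_u16(-1)), as_u16(-1) == LOV_ALL_STRIPES)
    print(hex(as_u64(-1)), as_u64(-1) == OBD_OBJECT_EOF)
```

Using the named constants instead of the bare casts keeps the behaviour identical while making the intent visible at each call site.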

Test-Parameters: trivial

Change-Id: I2345c5e67da20328e7173c9add8da27015df9d13
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52055
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
8 months ago LU-10026 csdc: DoM pattern could be a combined value 78/51978/5
Bobi Jam [Thu, 17 Aug 2023 17:00:03 +0000 (01:00 +0800)]
LU-10026 csdc: DoM pattern could be a combined value

DoM pattern is LOV_PATTERN_MDT for now, and in the future it could
be combined with LOV_PATTERN_COMPRESS to represent a compressed
DoM component.

Fix a minor glitch for lov_getstripe_old code path (in
ll_lov_getstripe_ea_info), which intends to return the last component
stripe info but the commit abf04e7ea3 omits to correctly set the
last component stripe info before using it.

Fixes: abf04e7ea3 ("LU-14337 lov: return valid stripe_count/size for PFL files")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id0779c30c004b6979f88bf96b7b7b74a8b8c26e4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51978
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 months ago LU-17023 krb: use a Kerberos realm different from default 14/51914/10
Sebastien Buisson [Thu, 10 Aug 2023 11:05:52 +0000 (13:05 +0200)]
LU-17023 krb: use a Kerberos realm different from default

It makes sense to give the ability to specify a Kerberos realm that is
different from the default realm as returned by
krb5_get_default_realm().

On client side, the desired realm needs to be specified via the new
'-R' option to lgss_keyring. This can be specified in the config file
/etc/request-key.d/lgssc.conf to replace the default domain, e.g.:
create lgssc * * /usr/sbin/lgss_keyring -R DOMAIN.COM %o %k %t %d %c %u %g %T %P %S

On server side, the desired realm can be specified via the new '-R'
parameter of the lsvcgssd daemon, replacing the default realm.

This patch adds sanity-krb5 test_1b to exercise the new realm options,
by just re-using the same realm as the test system is configured to
use. And former test_1 is renamed test_1a.

Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9c91d5cb9904781d546e77b1e46115fed433618f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51914
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17050 tests: test Kerberos env in sanity-krb5 68/52068/4
Sebastien Buisson [Thu, 24 Aug 2023 09:40:46 +0000 (11:40 +0200)]
LU-17050 tests: test Kerberos env in sanity-krb5

Test that the Kerberos environment is sane before trying to launch
sanity-krb5 tests.

Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1675ba7db8c62687c69359a15cc931b5dfd40018
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52068
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17000 lnet: fix various bugs in lib-move.c 76/51876/5
Timothy Day [Sat, 5 Aug 2023 19:12:15 +0000 (19:12 +0000)]
LU-17000 lnet: fix various bugs in lib-move.c

In lnet_select_peer_ni, best_lpni_is_preferred is often written
to before being immediately overwritten.

Addresses-Coverity-ID: 397646 ("Unused value")
Addresses-Coverity-ID: 397434 ("Unused value")

Both LNetPut and LNetGet were not freeing msg under certain failure
conditions. This leaks a small amount of memory each time it occurs.

Addresses-Coverity-ID: 397644 ("Resource leak")
Addresses-Coverity-ID: 397133 ("Resource leak")

Fix potential null dereference in lnet_find_best_ni_on_local_net
when best_lp gets defined but best_net doesn't.

Addresses-Coverity-ID: 397568 ("Explicit null dereferenced")

lnet_post_send_locked has an un-needed null check, since every path
leading to that block of code must dereference ni anyway.

Addresses-Coverity-ID: 397278 ("Dereference before null check")

In the other usage of msg_peerrtrcredit, it is accessed under
lpni_lock. Change the second usage to also be accessed under this
lock.

Addresses-Coverity-ID: 397606 ("Data race condition")

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I4012f407b10d0c9644535d49cce83a6c95d3d22d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51876
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16988 mdd: update projid when merging layout 59/51859/5
Hongchao Zhang [Mon, 14 Aug 2023 07:28:17 +0000 (15:28 +0800)]
LU-16988 mdd: update projid when merging layout

When creating mirrors via the special directory ".lustre/fid",
the project ID could not be set correctly, which causes
wrong quota calculation for the projid.

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ia4c3a8973b8c467642e12629d36fa42d64162084
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51859
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16977 utils: access_log_reader accesses beyond batch array 54/51754/4
Alexandre Ioffe [Tue, 25 Jul 2023 02:08:26 +0000 (19:08 -0700)]
LU-16977 utils: access_log_reader accesses beyond batch array

Fix access_log_reader accessing the sorted batch array beyond its upper
boundary when batch-fraction is 100%: treat fraction = 100% as a
special case, which requires no sorting and filtering.
Use a separate thread function to process the 100% fraction case.
Make some minor changes using an enum type to tidy the code.
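
One way such an overrun can arise is sketched below in Python. This is an assumption about the general shape of the problem, not the actual access_log_reader code: if a cutoff element is looked up at index n * fraction / 100 in a sorted array of n entries, a fraction of 100% gives index n, one past the end. The function names and the cutoff rule are hypothetical.

```python
def threshold_index(entry_count, fraction_percent):
    """Index of the cutoff element in a batch sorted hottest-first (assumed rule)."""
    return entry_count * fraction_percent // 100


def select_batch(scores, fraction_percent):
    """Keep the hottest fraction_percent of scores (hypothetical helper)."""
    if not scores:
        return []
    if fraction_percent >= 100:
        # Special case: keep everything, so no sorting or filtering is needed
        # and no cutoff element is ever looked up.
        return list(scores)
    ordered = sorted(scores, reverse=True)
    cutoff = ordered[threshold_index(len(ordered), fraction_percent)]
    return [score for score in ordered if score > cutoff]


if __name__ == "__main__":
    batch = [3, 9, 5, 7]
    print(select_batch(batch, 50))
    print(select_batch(batch, 100))
    # Without the special case the lookup index would equal len(batch).
    print(threshold_index(len(batch), 100), len(batch))
```

Handling 100% up front avoids both the out-of-range lookup and the cost of sorting a batch that is going to be kept whole anyway.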

Test-Parameters: trivial testlist=sanity
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Iba1734b17dc901875f343c793688aec17b9f7a93
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51754
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16763 obdclass: add unit tests for OBD life cycle 03/51103/7
Timothy Day [Fri, 12 May 2023 04:32:53 +0000 (04:32 +0000)]
LU-16763 obdclass: add unit tests for OBD life cycle

Add some simple OBD life cycle tests. These tests
consist of a kernel module which defines a simple OBD
device, and a few sanity tests. The new OBD device
prints logs validating that it has been loaded
correctly. Unlike other OBD devices, this one has
minimal side-effects. The new test OBD device has
been added to the test rpm and dkms.

sanity/55a aims to test that a device can be loaded
properly and found by the various OBD device search
functions.

sanity/55b aims to load the maximum number of allowed
OBD devices, which is currently 8192. It also times how
long it takes to perform the loading and unloading. In
the future, this could be used to test for performance
regression.

The tests avoid using any userspace function, like lctl
or lfs, since I noticed bugs when using them with a large
number of devices. Follow-up patches will include fixes
and more testing.

I used a variation of these tests when debugging
sanity/60a failures, and when debugging removing
MAX_OBD_DEVICES.

This test (obd_test.c) and the llog test (llog_test.c)
should probably be moved to a different directory in a
follow-up patch.

Test-Parameters: trivial testlist=sanity env=ONLY=55,ONLY_REPEAT=25
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ibc347ac962c59a4bbc26410c30f9cc5529e6c84d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51103
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-14111 tests: only support recovery-small test 146 for 2.15.54+ 98/52098/4
James Simmons [Fri, 25 Aug 2023 22:42:23 +0000 (18:42 -0400)]
LU-14111 tests: only support recovery-small test 146 for 2.15.54+

If you are running newer clients with older servers (2.15.3) then
recovery-small test 146 will fail since the old servers lack
the new sysfs file eviction_count.

Fixes: 3c69d46e176 ("LU-14111 obdclass: count eviction per obd_device")
Test-Parameters: trivial testlist=recovery-small env=ONLY=146 mdsversion=2.15.3
Change-Id: I53f6dabd305ec920e8de1d9fde407b2f2c15ba69
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52098
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16314 llite: Prefer %pK with seq_printf 12/51212/7
Shaun Tancheff [Thu, 24 Aug 2023 09:14:09 +0000 (04:14 -0500)]
LU-16314 llite: Prefer %pK with seq_printf

Update procfs and sysfs users to prefer %pK when
printing pointers so that when kptr_restrict is set to 1
a real pointer value is provided.

To enable printing non-hashed pointer values:
  sysctl -w kernel/kptr_restrict=1

This change also sets kptr_restrict to 1 for all clients
and servers under test by the test framework.

Test-Parameters: trivial
HPE-bug-id: LUS-10945
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iccfce1399648e752cb7b78afc75aacbfb0bde390
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16043 osc: allow error for write on CL_FSYNC_DISCARD 32/48032/4
Vladimir Saveliev [Wed, 26 Jul 2023 13:09:18 +0000 (16:09 +0300)]
LU-16043 osc: allow error for write on CL_FSYNC_DISCARD

In the case of CL_FSYNC_DISCARD, an error is allowed for a write of an osc object.

Otherwise, the included test fails in rm with:
  (osc_page.c:174:osc_page_delete()) Trying to teardown failed: -16
  (osc_page.c:175:osc_page_delete()) ASSERTION( 0 ) failed:
  (osc_page.c:175:osc_page_delete()) LBUG

Test-Parameters: trivial testlist=sanity env=ONLY=907
HPE-bug-id: LUS-10410
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: I0aae0dc470ba0371964e7643a6d84b19a1b4e106
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-15619 osc: Rename brw_page members 15/46715/4
Patrick Farrell [Wed, 30 Aug 2023 20:05:07 +0000 (16:05 -0400)]
LU-15619 osc: Rename brw_page members

The brw_page members have generic names - add a structure
related tag to the names.

test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I98e6f874902074934eb01476a9595f502526bc38
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46715
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-8130 osc: convert osc_quota hash to xarray 38/32038/76
James Simmons [Mon, 28 Aug 2023 14:03:03 +0000 (10:03 -0400)]
LU-8130 osc: convert osc_quota hash to xarray

The cl_quota_hash originally had 3 hashes, one for each type of quota
(USR, GRP, PRJ) that just stored on the client whether a particular
quota ID was over its limit. This was overkill since cl_quota_hash
only needs one bit to check if a particular ID has exceeded quota
with IO from this client, and there will usually be only a few IDs
that are actually exceeding their limit where a client is involved.
Instead, use the quota ID as the index into an Xarray, and store
a value with the quota TYPE(s) that are over the limit for that ID.
We only need to test the presence/absence of an ID and a quota type
without the need to store any additional values (the clients do not
track the actual quota usage or limits).
Testing whether a quota is exceeded for a particular ID is a two-step
process. First check if there is any entry for the particular ID,
and if it exists then check which quota type (USR, GRP, PRJ) is
over the limit for that ID value.  The same is done when marking
a particular quota ID/TYPE as over its limit - first look up the
ID and then add the TYPE flag to the value if not already set.
The Xarray implementation does offer using "marks" (up to 3 bits
per index) but in this case there is no other value that needs to
be stored into the Xarray other than one bit for any exceeded type,
so they are not used here.
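
The two-step lookup described above can be modelled roughly as
follows. This is only an illustrative sketch, not the kernel code:
a Python dict stands in for the Xarray, the flag values and function
names are hypothetical, and the clearing path is an assumption that
the commit message does not describe.

```python
# Illustrative model only: a dict stands in for the kernel Xarray, and the
# flag values below are hypothetical, not Lustre's actual constants.
USR, GRP, PRJ = 0x1, 0x2, 0x4

over_quota = {}  # quota ID -> bitmask of quota types over the limit


def set_over_limit(qid, qtype):
    """Look up the ID first, then add the TYPE flag if not already set."""
    over_quota[qid] = over_quota.get(qid, 0) | qtype


def clear_over_limit(qid, qtype):
    """Assumed inverse: drop the TYPE flag, remove the entry when empty."""
    remaining = over_quota.get(qid, 0) & ~qtype
    if remaining:
        over_quota[qid] = remaining
    else:
        over_quota.pop(qid, None)


def is_over_limit(qid, qtype):
    """Step 1: is there any entry for the ID?  Step 2: is this TYPE set?"""
    flags = over_quota.get(qid)
    return flags is not None and bool(flags & qtype)
```

Because only IDs that are over a limit have an entry, the common case
(an ID within quota) ends at the first step with a failed lookup.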

Change-Id: I9355ed2a7158f0d5cc0d600ad51ea1a1434f3e98
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/32038
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-17052 libcfs: fix build for old kernel 90/52090/4
Xinliang Liu [Fri, 25 Aug 2023 03:24:12 +0000 (03:24 +0000)]
LU-17052 libcfs: fix build for old kernel

Fix build for kernel v4.17 to v4.19.
These old kernels already have xarray.h, which is #included by fs.h,
but don't have full xarray support. libcfs's xarray.h must also be
#included to provide xarray support.

Rename the header define macro to ensure libcfs's xarray.h will be
included.

Test-Parameters: trivial
Test-Parameters: testlist=sanityn envdefinitions=ONLY=77,ONLY_REPEAT=20
Fixes: 778791dd7da1 ("LU-8130 libcfs: don't use radix tree for xarray")
Change-Id: I760c394cc1d885c2de79d1770243ab7f292b9b3a
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52090
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 months agoLU-16973 osd: adds SB_KERNMOUNT flag 31/51731/6
Alexander Boyko [Fri, 7 Jul 2023 19:35:51 +0000 (15:35 -0400)]
LU-16973 osd: adds SB_KERNMOUNT flag

During umount, mntput() is called. It uses the delayed_mntput()
function, which can take a long time to finish. The block
device remains occupied during the delayed work.

[ 8753.941980] Lustre: server umount XXX complete
[ 8800.129136] sysrq: SysRq : Trigger a crash

PID: 319306   TASK:XXXX   CPU: 2    COMMAND: "kworker/2:0"
 #0 __schedule at ffffffff9754e1d4
 #1 preempt_schedule_common at ffffffff9754e6fa
 #2 _cond_resched at ffffffff9754e72d
 #3 invalidate_mapping_pages at ffffffff96e72da5
 #4 invalidate_bdev at ffffffff96f5d13c
 #5 ldiskfs_put_super at ffffffffc1c82e34 [ldiskfs]
 #6 generic_shutdown_super at ffffffff96f1bdcc
 #7 kill_block_super at ffffffff96f1bed1
 #8 deactivate_locked_super at ffffffff96f1b784
 #9 cleanup_mnt at ffffffff96f3b86b

Use the SB_KERNMOUNT flag during mount, which leads to a
synchronous mntput().
The patch also calls flush_delayed_fput during umount to finish
delayed fput.

HPE-bug-id: LUS-11629
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia6729f6cbac85c3626562e946a4b96665a143714
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51731
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-12064 ptlrpc: set at_min=5 by default 09/50609/4
Andreas Dilger [Sat, 18 Mar 2023 00:34:06 +0000 (18:34 -0600)]
LU-12064 ptlrpc: set at_min=5 by default

Having at_min=0 as the default value can result in clients timing
out and/or being evicted too easily when there is a sudden spike
in server load.  Increase at_min to 5s by default.

For large clusters, at_min=15 is more reasonable, but distributing
a variable at_min value to clients will need more complex changes.
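
As a rough illustration of what a lower bound like at_min does, the
sketch below clamps a timeout estimate. The function name and the
idea that the estimate is a plain number of seconds are assumptions
made for the example; the real adaptive timeout code is more
involved.

```python
AT_MIN_DEFAULT = 5  # seconds, the new default described above


def adaptive_timeout(estimate, at_min=AT_MIN_DEFAULT):
    """Never let the timeout drop below at_min, however fast recent RPCs were."""
    return max(estimate, at_min)
```

With at_min=0 a 1s estimate would be used as-is, so a sudden spike
in server load could exceed it; with at_min=5 the same estimate is
raised to 5s.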

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3463cbc642458f6dd5977fe34478b135d1cd0219
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50609
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
8 months agoLU-13805 llite: add mm to dio struct 47/49947/31
Patrick Farrell [Wed, 8 Feb 2023 19:01:24 +0000 (14:01 -0500)]
LU-13805 llite: add mm to dio struct

When copying to or from userspace, we must use the mm from
the userspace thread.  This can be done either by running
in that thread or borrowing its mm.  Unaligned DIO does
some memory movement to userspace in ptlrpcd threads, so it
requires the user mm be stored in the sub dio.

This will be used by the main unaligned DIO patch and has
been split out for reviewability.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I419cb9f1899b8c8f9790ce25b3aba1d6f07397aa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49947
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
8 months agoLU-13805 clio: Add csi_complete 13/49913/35
Patrick Farrell [Mon, 6 Feb 2023 18:10:10 +0000 (13:10 -0500)]
LU-13805 clio: Add csi_complete

The next patch will make end_io potentially sleep, so we
need to modify how completion works to avoid holding a
spinlock over the end_io() call.

This patch is strictly supporting work for the next patch
and has been pulled out so it can be tested by itself.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iba3388a0e09fdd0ab2f4a95f1cde96908a485cfa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49913
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoNew tag 2.15.58 2.15.58 v2_15_58
Oleg Drokin [Fri, 1 Sep 2023 20:38:40 +0000 (16:38 -0400)]
New tag 2.15.58

Change-Id: I6d58a43d5904c24d32575b4790bcaabd9ebdfb6f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17038 tests: remove unused compile.sh script 54/52054/2
Timothy Day [Wed, 23 Aug 2023 16:26:41 +0000 (16:26 +0000)]
LU-17038 tests: remove unused compile.sh script

This script just runs make automatically. It doesn't
appear to be called by any other Lustre sanity
test script. I doubt it has been used in many
years. This patch removes it.

Checked for usage using:

 `git grep -i "compile.sh"`

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If1615196bc8d004a63ad8baddd1d3fe3e360dc74
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52054
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17038 tests: remove mlink utility 51/52051/4
Timothy Day [Wed, 23 Aug 2023 02:52:32 +0000 (02:52 +0000)]
LU-17038 tests: remove mlink utility

The mlink utility is nearly identical to the link utility
provided by coreutils. They only differ by some GNU
boilerplate. All tests using mlink are replaced with link.
Luckily, mlink is only used in a few places.

Used the following command:

 `git grep -i mlink | grep -i -v symlink`

to track down all uses of mlink.

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I197235572d2cb267ee68930c64058e4f5ffe5be1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-12678 lnet: discard lnet_kvaddr_to_page 41/52041/3
Mr NeilBrown [Wed, 23 Aug 2023 00:18:41 +0000 (20:18 -0400)]
LU-12678 lnet: discard lnet_kvaddr_to_page

This function is not needed, so discard it.

Change-Id: Iffe9745adf477a5f4b78d8ef191849179426cb07
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17043 enc: fix osd lookup cache for long encrypted names 16/52016/2
Sebastien Buisson [Mon, 21 Aug 2023 09:44:32 +0000 (11:44 +0200)]
LU-17043 enc: fix osd lookup cache for long encrypted names

Fix osd lookup cache to support files with long encrypted names.
Those encrypted names can be up to 256 bytes, not NUL terminated.

Fixes: 29f8eb2a67 ("LU-16405 osd: lookup cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ica2329c8a0990395307a14fe9bb9d43db3b364ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52016
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-15367 llite: iotrace standardization 02/52002/3
Patrick Farrell [Fri, 18 Aug 2023 18:31:32 +0000 (14:31 -0400)]
LU-15367 llite: iotrace standardization

Clean up and standardize some of the iotrace messages for
easier parsing.

Add a clear 'START' indicator.

Remove a now-redundant debug message in the mmap code.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia620cc8c783509cbc3f47b21a274d67d860b80e7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52002
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
8 months agoLU-17039 build: cleanup ib_dma_map_sg 79/51979/2
Shaun Tancheff [Fri, 18 Aug 2023 04:50:56 +0000 (23:50 -0500)]
LU-17039 build: cleanup ib_dma_map_sg

CONFIG_INFINIBAND_VIRT_DMA is a kernel configuration option
that in some cases conflicts with the configuration of the
externally provided OFED stack.

During configure, when ib_dma_map_sg fails to build correctly,
we can simply #undef CONFIG_INFINIBAND_VIRT_DMA to resolve
the inconsistent configuration that breaks ib_dma_map_sg.

Test-Parameters: trivial
HPE-bug-id: LUS-11771
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Id0849464d3ffbd573cac13016191d80c6ea991af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51979
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17038 tests: remove munlink utility 77/51977/4
Andreas Dilger [Thu, 17 Aug 2023 22:06:36 +0000 (16:06 -0600)]
LU-17038 tests: remove munlink utility

The munlink utility is obsoleted by the unlink command added in
the coreutils package many moons ago, and can be removed.  All
tests using munlink are replaced with unlink.

Test-Parameters: trivial testlist=recovery-small,replay-dual,replay-single
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I984406525ed958814bd8af74a2d81c4920e320b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16510 build: check if CONFIG_FORTIFY_SOURCE is defined 73/51973/2
Jian Yu [Thu, 17 Aug 2023 20:46:38 +0000 (13:46 -0700)]
LU-16510 build: check if CONFIG_FORTIFY_SOURCE is defined

The linux/fortify-string.h header file should not be
included while the kernel config option CONFIG_FORTIFY_SOURCE
is not defined.

Change-Id: I2e1905406e892b182f143d512a2d3722b141e52d
Fixes: 919b93b951d4 ("LU-16510 build: fortified memcpy from linux 6.1")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51973
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-17036 utils: make sure resize option is legit 70/51970/2
Li Dongyang [Thu, 17 Aug 2023 13:27:00 +0000 (23:27 +1000)]
LU-17036 utils: make sure resize option is legit

To align the metadata on 1MB boundaries we manually
set the resize blocks to 16368G for 4K block size,
however mke2fs expects the resize blocks to be larger
than the device size.

For devices between 16368G and 16384G, mke2fs
will fail with:
The resize maximum must be greater than the filesystem size.
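
The failing range can be reproduced with simple arithmetic. The
sketch below is one possible guard, not necessarily what the patch
does: the function name is hypothetical, and returning None (that is,
not passing the fixed resize value) is an assumption.

```python
GIB = 1 << 30
BLOCK_SIZE = 4096
RESIZE_BLOCKS = 16368 * GIB // BLOCK_SIZE  # 16368G expressed in 4K blocks


def resize_option(device_bytes):
    """Return the resize value to pass, or None if it would be rejected."""
    device_blocks = device_bytes // BLOCK_SIZE
    # mke2fs requires the resize maximum to exceed the filesystem size.
    if device_blocks >= RESIZE_BLOCKS:
        return None
    return RESIZE_BLOCKS
```

For example, a 16370G device has more 4K blocks than RESIZE_BLOCKS,
so the fixed value cannot be used for it.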

Change-Id: I4567a79c1405e9527d7f0f9bec4c8a7aae0eba6c
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51970
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17031 build: fix redefine __compiletime_strlen error 53/51953/2
Qian Yingjin [Wed, 16 Aug 2023 02:11:39 +0000 (22:11 -0400)]
LU-17031 build: fix redefine __compiletime_strlen error

The Lustre build failed on Ubuntu 22.04 kernel v5.17 with "redefine
__compiletime_strlen".
This patch fixes the build error.

Fixes: 919b93b951 ("LU-16510 build: fortified memcpy from linux 6.1")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ic26daecd6b91614e01b5b0030f40eede205a21f7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51953
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17030 llite: allow setting max_cached_mb to a % 52/51952/7
Patrick Farrell [Tue, 15 Aug 2023 23:08:12 +0000 (19:08 -0400)]
LU-17030 llite: allow setting max_cached_mb to a %

Lustre's max_cached_mb parameter is hard to use because it
must be set to a specific numeric value, so in effect it
cannot be set on the server side unless all clients are
guaranteed identical.

Let's add the ability to set that to a % of memory to make
it more useful.
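
A minimal sketch of the idea, assuming the value arrives as a string
that is either a plain MB count or a number followed by "%". The
function name, the integer rounding, and the range check are
hypothetical and are not taken from the patch.

```python
def parse_max_cached_mb(value, total_ram_mb):
    """Accept either an absolute MB count ("2048") or a percentage ("50%")."""
    value = value.strip()
    if value.endswith("%"):
        pct = int(value[:-1])
        if not 0 <= pct <= 100:
            raise ValueError("percentage must be between 0 and 100")
        # Resolve the percentage against this client's own memory size.
        return total_ram_mb * pct // 100
    return int(value)
```

The same "50%" setting then resolves to a different absolute limit
on each client, which is what makes it usable when clients are not
identical.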

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1f9f5a8a5d671ab00b7ab6133bb9b1d1214ca59e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-10885 docs: note flock now being enabled by default 48/51948/2
Laura Hild [Tue, 15 Aug 2023 17:04:37 +0000 (13:04 -0400)]
LU-10885 docs: note flock now being enabled by default

mount -o flock was made the default, but the mount.lustre(8) man-page
still said noflock is default.  Text based on comments in LU-10885 and
http://wiki.lustre.org/Mounting_a_Lustre_File_System_on_Client_Nodes.

Signed-off-by: Laura Hild <lsh@jlab.org>
Change-Id: I48bfc0260fb948771f5cf4fb8cbc6ee9588e2217
Test-Parameters: trivial
Fixes: 16fb13eb3863 ("LU-10885 llite: enable flock mount option by default")
Fixes: 3613af3e15cb ("LU-10885 llite: enable flock mount option by default")
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51948
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17015 gss: support large kerberos token on client 46/51946/6
Aurelien Degremont [Tue, 15 Aug 2023 14:03:07 +0000 (16:03 +0200)]
LU-17015 gss: support large kerberos token on client

If the current Kerberos setup uses large tokens, such as
when the PAC feature is enabled for Kerberos, the client can crash.

Return an error instead of asserting to avoid the crash
and increase the default buffer size to 4kB instead of 1kB.
This will only increase the SEC_CTX_INIT request size, and
the buffer is shrunk before being sent over the wire.

This will allow security tokens up to 2kB to be properly
handled by Lustre. Above that size, a different issue will
happen on the server side that will require another patch.

Test-Parameters: trivial kerberos=true testlist=sanity-krb5
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: I9ce30ee7f8c95bfe41525c49986ffac45ffac97c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51946
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-17006 lnet: set up routes for going across subnets 21/51921/4
Serguei Smirnov [Fri, 11 Aug 2023 00:58:11 +0000 (17:58 -0700)]
LU-17006 lnet: set up routes for going across subnets

Modify ksocklnd-config to set up a route that uses the
default gateway for the subnet if a default gateway
is defined, for example:
        ip route add default via <gw_for_eth0> dev eth0 table eth0
which results in a route similar to the following added to
the eth0 route table:
        default via <gw_for_eth0> dev eth0

If there's no gateway found for the eth0 subnet, keep the old
behaviour which results in the following added to eth0
route table:
        <eth0_subnet> dev eth0 proto kernel scope link src <eth0_ip>

This makes sure that MR traffic goes out the intended interface
as selected by LNet no matter whether going across subnets or not.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I84a299c8b7eb4cdb4fc24408a1e42ad0283d9219
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51921
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16766 obdclass: trim kernel thread names in jobids 19/51919/2
Thomas Bertschinger [Thu, 13 Jul 2023 22:32:52 +0000 (18:32 -0400)]
LU-16766 obdclass: trim kernel thread names in jobids

When collecting jobstats on operations coming from kernel threads, it
is more useful and reduces the noisiness of the data if the names of
kernel threads are trimmed so that all "kworker/CPU:ID" threads are
collected under "kworker", all "ll_sa_PID" threads under ll_sa, etc.
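
The grouping can be pictured with the sketch below. The trimming
rules (cut at the first "/" and drop a trailing "_<digits>" suffix)
are assumptions chosen to match the two examples above; the actual
patch may use different rules.

```python
import re


def trim_kthread_name(comm):
    """Collapse per-CPU or per-PID kernel thread names into one bucket."""
    # "kworker/2:0" -> "kworker": keep only the part before the first "/".
    comm = comm.split("/", 1)[0]
    # "ll_sa_12345" -> "ll_sa": drop a trailing "_<digits>" suffix.
    return re.sub(r"_\d+$", "", comm)
```

Names without such suffixes pass through unchanged.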

Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Icd82a99c1153de0277ea5ed3f4b1d92535809921
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51919
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17020 kernel: update RHEL 9.2 [5.14.0-284.25.1.el9_2] 86/51886/4
Jian Yu [Tue, 8 Aug 2023 22:43:03 +0000 (15:43 -0700)]
LU-17020 kernel: update RHEL 9.2 [5.14.0-284.25.1.el9_2]

Update RHEL 9.2 kernel to 5.14.0-284.25.1.el9_2.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el9.2 serverdistro=el9.2 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el9.2 serverdistro=el9.2 testlist=sanity

Change-Id: Icdbd9cfa18a72d3e6f09f366952e6e0f2ac1ebd2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51886
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17013 lov: fill FIEMAP_EXTENT_LAST flag 63/51863/9
Lei Feng [Thu, 3 Aug 2023 09:44:15 +0000 (17:44 +0800)]
LU-17013 lov: fill FIEMAP_EXTENT_LAST flag

If a file has N extents and the fiemap is fetched with exactly N
extent slots, the last extent will be missing the FIEMAP_EXTENT_LAST
flag. Fix it.
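
The intended behaviour can be sketched as below. This is a simplified
model, not the lov code: the function name and the extent layout are
hypothetical, and only the FIEMAP_EXTENT_LAST value comes from
linux/fiemap.h.

```python
FIEMAP_EXTENT_LAST = 0x00000001  # value from linux/fiemap.h


def fill_fiemap(file_extents, slots):
    """Copy up to `slots` extents; flag the file's final extent as LAST.

    file_extents: list of (logical, length) tuples (hypothetical layout).
    Returns a list of dicts with a `flags` field.
    """
    out = []
    for i, (logical, length) in enumerate(file_extents[:slots]):
        flags = 0
        # The flag depends on whether this is the file's last extent, not on
        # whether the output array happens to be full.
        if i == len(file_extents) - 1:
            flags |= FIEMAP_EXTENT_LAST
        out.append({"logical": logical, "length": length, "flags": flags})
    return out
```

With 3 extents and 3 slots the third entry carries the flag; with
only 2 slots no returned entry does, since the file's last extent
was not reached.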

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: testlist=sanityn env=ONLY=71a+71b+71c
Change-Id: I4556b31f0d04bdf8e83f323e83b871b093beaa5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51863
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
8 months agoLU-17011 utils: monotonic clock in lfs mirror 52/51852/4
Alex Zhuravlev [Wed, 2 Aug 2023 10:31:57 +0000 (13:31 +0300)]
LU-17011 utils: monotonic clock in lfs mirror

Use monotonic clocks instead of realtime to avoid affecting
bandwidth or hanging the transfer if the clock is changed.
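
To illustrate why the clock source matters for rate limiting, here is
a small sketch of a throttle calculation. It is not the lfs mirror
code; the function and parameter names are hypothetical.

```python
import time


def throttle_delay(bytes_sent, start, bandwidth_limit, now=None):
    """Return seconds to sleep so the average rate stays under the limit.

    `start` and `now` must come from time.monotonic(), which never jumps
    when the wall clock is adjusted.
    """
    if now is None:
        now = time.monotonic()
    elapsed = now - start
    # Time the transfer should have taken at the limited rate.
    expected = bytes_sent / bandwidth_limit
    return max(0.0, expected - elapsed)
```

If `start` and `now` were realtime values and the wall clock were
set back between them, `elapsed` would shrink or go negative and the
computed delay could become arbitrarily long.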

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I58cf327d235448e93fa2ed63cefdf4dd01306e71
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51852
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-17009 tests: fix runtests to read file name with backslash 47/51847/2
Jian Yu [Wed, 2 Aug 2023 07:16:04 +0000 (00:16 -0700)]
LU-17009 tests: fix runtests to read file name with backslash

If a file in /etc dir has a name with backslash, then runtests
will fail because the read command considers the backslash as
an escape character. This patch fixes the issue by adding "-r"
option to read.

Change-Id: Iab912ba9708f5b64e6bb8d8adc266ff23ed32de5
Test-Parameters: trivial testlist=runtests
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17000 lnet: remove redundant errno check in liblnetconfig.c 46/51846/3
Jake McManus [Thu, 10 Aug 2023 03:12:03 +0000 (23:12 -0400)]
LU-17000 lnet: remove redundant errno check in liblnetconfig.c

Variable root is assigned NULL at the beginning of
lustre_lnet_show_stats(). If l_ioctl() fails, its return value
stored in rc will take the True path in the following conditional.
This conditional currently contains a redundant check for errno,
despite the fact that rc would equal -errno in this case. If errno had
changed between the l_ioctl() call and this subsequent read, errno
could be 0, which would, from the out: label, lead to a NULL
root being used as a parameter in cYAML_insert_sibling() and
dereferencing the NULL root pointer.

Replace l_errno's use as a parameter in strerror with -rc, and
remove the declaration and other references to l_errno.

Addresses-Coverity-ID: 397850 ("Explicit null dereferenced")

Signed-off-by: Jake McManus <jacobpmcmanus@gmail.com>
Change-Id: I78f080837b60c8216c52bda8562d4c0f9f45a132
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51846
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16866 tests: Use wait_update to check LNet recovery state 45/51845/4
Chris Horn [Mon, 31 Jul 2023 19:03:57 +0000 (13:03 -0600)]
LU-16866 tests: Use wait_update to check LNet recovery state

The monitor thread is sometimes woken up on demand and sometimes sleeps
for one-second intervals. This makes it hard to precisely predict how
long we need to sleep for ping counts to update and NIs to be
processed out of recovery.
Use wait_update when checking LNet recovery queues and ping counts.
Additional drop rules are added to tests 210 and 211 because it has
been observed that other test instances may issue pings to the node
running 210/211 and cause the ping_count to reset. These additional
drop rules ensure that any incoming messages are dropped.
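
The general pattern of polling for a state instead of sleeping for a
fixed time looks like the sketch below. This is not the test
framework's wait_update helper; the function name, signature, and
defaults are hypothetical.

```python
import time


def wait_for(check, expected, timeout=10.0, interval=0.1):
    """Poll check() until it returns `expected` or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if check() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller no longer needs to guess how long the monitor thread
takes; it only needs an upper bound after which the test fails.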

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=210,211,216
Test-Parameters: testlist=sanity-lnet env=ONLY=211,ONLY_REPEAT=100
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ief84388222e46c23952af4ad1d098924e73a8598
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51845
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-17000 misc: remove Coverity annotations 93/51793/2
Timothy Day [Fri, 28 Jul 2023 04:49:50 +0000 (04:49 +0000)]
LU-17000 misc: remove Coverity annotations

These Coverity function annotations were added
around 10 years ago. Since then, Coverity seems
to produce fewer false positives. Out of about 20
annotations, only 3 warnings get suppressed.
Thus, the applicability of these annotations
should be re-evaluated.

Coverity has more advanced tools now for reducing
false positives. Various Lustre functions and
macros could be modeled rather than using
function annotations. But first, we need to get
a good idea of what kinds of false positives are
being generated.

https://scan.coverity.com/tune

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ibcb9cf55574675e20b13a4f7a1b9142a3b75e262
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51793
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16984 tests: replay-dual/31 checks file from DIR2 62/51762/2
Lei Feng [Wed, 26 Jul 2023 00:52:10 +0000 (08:52 +0800)]
LU-16984 tests: replay-dual/31 checks file from DIR2

In replay-dual/test_31, check file existence from DIR2.
Add more messages for diagnosis.

Fixes: 07764c4eeb ("LU-16953 tests: wait longer in replay-dual/test_31")
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=replay-dual env=ONLY=31,ONLY_REPEAT=100
Change-Id: Iee679ee94ac2cb51baad1651bfaddf452fafdbd1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51762
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16961 clang: plugins and build system integration 59/51659/4
Timothy Day [Thu, 13 Jul 2023 04:19:41 +0000 (04:19 +0000)]
LU-16961 clang: plugins and build system integration

Clang has a plugin system. Compiler extensions can be created
by making a shared library and loading it via the "-fplugin"
options. This makes it simple to implement custom warnings
and static analyzers.

This patch adds a plugin to detect functions that should have
been made static. This plugin has been run over the majority
of the Lustre tree and patches have been submitted for all
warnings. The plugin did not return any false positives in
my testing.
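
As an illustration of the kind of code such a check is aimed at (an assumed example, not taken from the Lustre tree or from the plugin's own tests): a helper that is only called within one source file can be given internal linkage with static, while the function used from other files keeps external linkage. The names clamp_to_byte() and scale_pixel() are hypothetical.

```c
/* Only used inside this file, so it can be static (internal linkage).
 * Without "static" it would be exported from the object file even
 * though no other file calls it. */
static int clamp_to_byte(int v)
{
	if (v < 0)
		return 0;
	if (v > 255)
		return 255;
	return v;
}

/* Intended to be called from other files, so it stays non-static. */
int scale_pixel(int v, int factor)
{
	return clamp_to_byte(v * factor);
}
```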

It also adds the "--enable-compiler-plugins" configure option,
which automatically builds and sets up the in-tree C compiler
plugins. The option force-enables the plugin regardless of
which compiler is in use. This behavior could be changed if
there is ever a need to support GCC specific plugins.

Also, add the configure checks needed to support building C++
in the Lustre tree. Clang and GCC plugins (and the compilers
themselves) are written in C++.

The license for the plugin mirrors that of the LLVM project
itself. This leaves the door open for contributing this
plugin upstream in the future. This isn't being upstreamed
now because it lacks any significant user community. Hence,
the plugin does not appear to meet the requirements for
upstreaming based on https://clang.llvm.org/get_involved.html.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I747ed91b53e765cc58e91a3eb9ec6c12b9908a96
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51659
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16605 lfs: Add -n option to fid2path 26/51626/12
Arshad Hussain [Tue, 11 Jul 2023 05:55:36 +0000 (11:25 +0530)]
LU-16605 lfs: Add -n option to fid2path

Add '-n' option to fid2path to allow printing
only the filename of the file instead of the
whole parent pathname.

Test-case sanity/226d added.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ieebd39a1655b4e3ad20cdbb4941dbb44882845f4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51626
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16943 tests: fix replay-single/135 under hard failure mode 74/51574/6
Jian Yu [Wed, 12 Jul 2023 13:48:41 +0000 (21:48 +0800)]
LU-16943 tests: fix replay-single/135 under hard failure mode

This patch fixes replay-single test_135() to load libcfs module
on the failover partner node to avoid 'fail_val' setting error.
It also fixes the issue that not all of the OSTs are mounted after
failing back ost1.

Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200 testlist=replay-single
Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200 fstype=zfs testlist=replay-single

Test-Parameters: trivial env=REPLAY_SINGLE_EXCEPT=200,FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=replay-single

Change-Id: Id46c722a6db9d832829a739f41f7462b32a6d9d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51574
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16936 auster: add --client-only option 09/51509/4
Timothy Day [Thu, 29 Jun 2023 15:37:21 +0000 (15:37 +0000)]
LU-16936 auster: add --client-only option

Add flag to auster to run sanity tests only on the
client-side. This leverages some existing functionality
to avoid having to set up ssh to filesystem hosts and
some other tedious setup.

Force test-framework.sh to honor the --no-setup flag.
Several test suites attempt to set up Lustre even if
auster says not to. Some lower level tests, like those
related to OBD device loading, require Lustre to not
be set up.

Change some [ to [[ in test-framework.sh to silence
some error messages.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I24de10743c3845b51fe29518ffc993b15a7c2cdd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51509
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16883 ldiskfs: update for ext4-delayed-iput for RHEL9.0 76/51376/2
Shaun Tancheff [Tue, 20 Jun 2023 07:31:53 +0000 (14:31 +0700)]
LU-16883 ldiskfs: update for ext4-delayed-iput for RHEL9.0

ext4-delayed-iput patch does not apply cleanly to RHEL9.0

Adjust the minor conflict in ext4_put_super()

Test-Parameters: trivial
Fixes: 616fa9b581 ("LU-15404 ldiskfs: use per-filesystem workqueues to avoid deadlocks")
HPE-bug-id: LUS-11661
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia8c2dcda50417b113399973f177a14283514a1ff
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16896 flr: resync should not change file size 44/51344/6
Bobi Jam [Sat, 17 Jun 2023 00:51:26 +0000 (08:51 +0800)]
LU-16896 flr: resync should not change file size

Mirror resync could punch a hole reaching the end of file in a
mirror, which could change the file size when that mirror is used.

This patch calls truncate after punch in this case to keep the file
size unchanged in the mirror.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ia0fc1f220a32a60f3516c69e86867796ae5c35c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16906 build: Server for newer SUSE 15 SP3 kernels 38/51338/5
Shaun Tancheff [Tue, 15 Aug 2023 09:06:48 +0000 (04:06 -0500)]
LU-16906 build: Server for newer SUSE 15 SP3 kernels

Update the SUSE 15 SP3 server support for newer kernels
including LTSS series kernels.

Add a new ldiskfs patch series for updated SUSE 15 SP3
kernels with an updated ext4-pdirop.patch

Test-Parameters: trivial
HPE-bug-id: LUS-11676
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0acf81abfcc71a64dc09a344a9231d86a44f193e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51338
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16477 ldiskfs: Add ext4-enc-flag patch for SUSE 15 SP5 45/51945/3
Shaun Tancheff [Tue, 15 Aug 2023 12:40:50 +0000 (07:40 -0500)]
LU-16477 ldiskfs: Add ext4-enc-flag patch for SUSE 15 SP5

Include ext4-enc-flag for linux 5.14 in the 5.14 based SUSE 15 SP5
ldiskfs series.

Test-Parameters: trivial
HPE-bug-id: LUS-11442
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If73c1665d5623f90d6908b049eb27755952b03f0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51945
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16821 llite: report 1MiB directory blocksize 60/50960/4
Andreas Dilger [Thu, 11 May 2023 17:49:52 +0000 (11:49 -0600)]
LU-16821 llite: report 1MiB directory blocksize

Report st_blksize=1048576 for directories so that glibc readdir()
will allocate a larger buffer to match the MDS_READDIR size
and reduce the number of syscalls for large dirs.
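
Userspace can observe the reported value through stat(2). The following is a minimal sketch assuming a POSIX system; dir_blksize() is a hypothetical helper name, and the value returned depends on the filesystem and client version in use.

```c
#include <sys/stat.h>

/* Hypothetical helper: return the preferred I/O block size that the
 * filesystem reports for a path, or -1 if stat() fails. */
static long dir_blksize(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return (long)st.st_blksize;
}
```

Calling dir_blksize() on a directory of a client carrying this patch would be expected to return 1048576, per the description above.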

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: If64057c20ecc35194c319d2a88c3036f12c41ed5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
8 months agoLU-16816 obdclass: make import_event more robust 15/50915/4
Sebastien Buisson [Wed, 10 May 2023 12:49:00 +0000 (14:49 +0200)]
LU-16816 obdclass: make import_event more robust

Make mdc_import_event and osc_import_event more robust, by not
assuming input variables can be dereferenced.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31a6477d58b7bb9a557ea561f7b0fa3fbcae5762
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16232 script: fix the argument parse 76/50876/4
Yang Sheng [Sat, 6 May 2023 07:16:17 +0000 (15:16 +0800)]
LU-16232 script: fix the argument parse

The issue makes the script skip other arguments if
the special parameter is not the last one.

Test-Parameters: trivial

Fixes: b533700add (LU-16232 scripts: changelog/updatelog emergency cleanup)
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ia309e7b6f1a62e76b80851848601c3d0b03be8b2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50876
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-9859 libcfs: discard cfs_gettok and cfs_str2num_check 44/50844/3
Mr NeilBrown [Thu, 24 Aug 2023 14:32:40 +0000 (10:32 -0400)]
LU-9859 libcfs: discard cfs_gettok and cfs_str2num_check

cfs_gettok() and cfs_str2num_check() are no longer used in the kernel,
so remove them.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I49a8378f049a936a742681293db616f7eb9b11af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50844
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16552 test: add new lnet test for Multi-Rail setups 02/50302/19
James Simmons [Sun, 13 Aug 2023 15:02:33 +0000 (11:02 -0400)]
LU-16552 test: add new lnet test for Multi-Rail setups

You can crash the lnet kernel module by setting up an interface with
lctl net up and then attempting to set up the interface with
the import function. This is due to improper clearing of the net_cpts
array.

Currently sanity-lnet.sh doesn't really test MR setups. Because of
this a few bugs slipped in. Add two new tests to ensure MR setups
behave properly. Test 107 checks that deleting a second interface
in an MR setup doesn't crash a node. Test 108 creates a Multi-Rail
setup of a tcp LNet net with two interfaces, one real and the
other fake. A bug was preventing the second fake interface from
being added.

Test-Parameters: trivial testlist=sanity-lnet
Change-Id: Ic69e14bd0617f4d6fe931140b5b6d43b795843cf
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16374 ldiskfs: implement security.encdata xattr 56/49456/13
Sebastien Buisson [Tue, 20 Dec 2022 14:40:52 +0000 (15:40 +0100)]
LU-16374 ldiskfs: implement security.encdata xattr

security.encdata is a virtual xattr containing information related
to encrypted files. It is expressed as ASCII text with a "key: value"
format, and space as field separator. For instance:

   { encoding: base64url, size: 3012, enc_ctx: YWJjZGVmZ2hpamtsbW
   5vcHFyc3R1dnd4eXphYmNkZWZnaGlqa2xtbg, enc_name: ZmlsZXdpdGh2ZX
   J5bG9uZ25hbWVmaWxld2l0aHZlcnlsb25nbmFtZWZpbGV3aXRodmVyeWxvbmdu
   YW1lZmlsZXdpdGg }

'encoding' is the encoding method used for binary data, assume name
can be up to 255 chars.
'size' is the clear text file data length in bytes.
'enc_ctx' is encoded encryption context, 40 bytes for v2.
'enc_name' is encoded encrypted name, 256 bytes max.
So overall, this xattr is at most 727 chars plus the terminating '\0'.
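
The encoded field lengths follow from the base64url expansion ratio. A small sketch, assuming unpadded base64url (which matches the example above, where neither field ends in '='); base64url_len() is a hypothetical helper name, not a function from the patch.

```c
#include <stddef.h>

/* Number of characters needed to encode nbytes of binary data as
 * base64url without '=' padding: every 3 input bytes become 4 output
 * characters, and a trailing 1 or 2 bytes become 2 or 3 characters. */
static size_t base64url_len(size_t nbytes)
{
	return (nbytes * 4 + 2) / 3;
}
```

Under that assumption, a 40-byte encryption context encodes to 54 characters and a 256-byte encrypted name to 342 characters.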

On get, the value of the security.encdata xattr is computed from
encrypted file's information.
On set, encrypted file's information is restored from xattr value.
The encrypted name is stored temporarily in a dedicated xattr
LDISKFS_XATTR_NAME_RAWENCNAME, that will be used to set correct name
at linkat.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia318c39d403b1c448e71bcd5b29862d022d05d0a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49456
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16235 hsm: check CDT state before adding actions llog 42/48842/6
Nikitas Angelinas [Tue, 12 Jul 2022 17:15:36 +0000 (20:15 +0300)]
LU-16235 hsm: check CDT state before adding actions llog

Don't allow HSM requests to be added to the actions llog when
cdt_state is in CDT_STOPPED/CDT_STOPPING as the CDT is unavailable, or
in CDT_INIT as any HSM requests in the llog may not have been fully
processed and so cdt_last_cookie may not have been set appropriately,
otherwise a colliding cookie value can be reused in
mdt_agent_record_add() and the assertions in
cdt_agent_record_hash_add() can be triggered:

"ASSERTION( carl0->carl_cat_idx == carl1->carl_cat_idx ) failed"
"ASSERTION( carl0->carl_rec_idx == carl1->carl_rec_idx ) failed"

Requests needed to implement the Remove Archive on Last Unlink (RAoLU)
policy are allowed when the CDT is shutdown, as those are safe
operations. They are also allowed during CDT initialization, even
though this can lead to the assertions being triggered, as doing so
maintains administrator expectations regarding file archives always
being removed when the RAoLU policy is enabled. This could possibly be
improved by e.g. failing when mdt_handle_last_unlink() is not able to
add an HSM remove request, or saving the requests in an llog so they
can be sent if the CDT is available later.

For the same reason, the llog needs to be processed before setting
cdt_state to CDT_RUNNING in the coordinator thread.
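
The gating rule described above can be sketched as a small predicate. This is an illustration only: the enum lists just the states named in this message, and cdt_may_add_request() and its is_raolu_remove parameter are hypothetical names, not the functions changed by the patch.

```c
#include <stdbool.h>

/* Only the coordinator states named in this commit message. */
enum cdt_state {
	CDT_STOPPED,
	CDT_INIT,
	CDT_RUNNING,
	CDT_STOPPING,
};

/* Return true if a request may be added to the actions llog. */
static bool cdt_may_add_request(enum cdt_state state, bool is_raolu_remove)
{
	/* RAoLU remove requests are let through regardless of state */
	if (is_raolu_remove)
		return true;

	switch (state) {
	case CDT_STOPPED:
	case CDT_STOPPING:
	case CDT_INIT:
		/* CDT unavailable, or cdt_last_cookie possibly not set yet */
		return false;
	default:
		return true;
	}
}
```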

Change-Id: I4b5f5ee22f74827b31d8ed5917a8fc16e35d1f16
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
HPE-bug-id: LUS-8231, LUS-11064
Fixes: e26d7cc3 ("LU-14399 hsm: process hsm_actions in coordinator")
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48842
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
8 months agoLU-15526 mdt: enable remote PDO lock 33/46733/5
Lai Siyao [Fri, 7 Jul 2023 15:06:02 +0000 (11:06 -0400)]
LU-15526 mdt: enable remote PDO lock

If the parent directory is located on a remote MDT, enqueue two locks,
as for a local PDO lock, if it is locked in LCK_PW mode. With this change,
creating directories (either local or remote) under one directory will
rarely trigger commit-on-sharing (unless their PDO hashes are equal).

Updated sanityn 33c.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I28030c45fbf137f5912863ae5eacfc8372db6754
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46733
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-13730 tests: add file mirroring to racer 68/41368/3
Andreas Dilger [Fri, 29 Jan 2021 21:01:05 +0000 (14:01 -0700)]
LU-13730 tests: add file mirroring to racer

Add "lfs mirror extend" to racer to add mirrors to existing files.

Test-Parameters: trivial testlist=racer,racer,racer
Test-Parameters: fstype=zfs testlist=racer,racer,racer
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaa64ed2de54533838ce955f88a1be592923ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-14361 statahead: add statahead advise IOCTL 25/48625/16
Qian Yingjin [Thu, 22 Sep 2022 09:24:14 +0000 (05:24 -0400)]
LU-14361 statahead: add statahead advise IOCTL

This patch reuses ioctl(LL_IOC_LADVISE2) for statahead advice.
This allows userspace programs to advise the kernel statahead
of the order in which they will traverse a directory, so that
the client can prefetch inode attributes from the MDT, similar
to what posix_fadvise(POSIX_FADV_SEQUENTIAL) does for file data.

After patching mdtest to add this statahead IOCTL hint, statahead
can support the mdtest benchmark with its regular file naming format:
mdtest.$rank.$i
The usage of this statahead advise IOCTL could be as follows:
open(dir);
ioctl(dir_fd, LL_IOC_LADVISE2, ...);
stat mdtest.0.0;
stat mdtest.0.1;
stat mdtest.0.2;
stat mdtest.0.3;
...
closedir(dir);

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iac38e33bfc6d7a0b755c2646ba8053a263e3afc9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48625
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-14156 utils: mirror split to check for last in-sync early 82/40782/47
Alex Zhuravlev [Fri, 27 Nov 2020 11:00:46 +0000 (14:00 +0300)]
LU-14156 utils: mirror split to check for last in-sync early

Currently, the check that prevents splitting off the last in-sync
component is done once the file is opened with O_RDWR, which interrupts
an ongoing resync/extend process. Instead, we can do this check early,
once the layout is fetched (after the first open with O_RDONLY).

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iee08d23008b44d2a7b2127358116a95ace40b7dd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40782
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-12645 llite: Move readahead debug before exit 32/51932/5
Patrick Farrell [Fri, 11 Aug 2023 22:01:26 +0000 (18:01 -0400)]
LU-12645 llite: Move readahead debug before exit

The core debug of ll_readahead() is after two return
conditions, which makes it really tricky to debug those
conditions.

Let's fix that.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic3a3854527cad62c891c6a25029353a4742e555f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51932
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-6142 lov: cleanup unneeded macros from lov_request.c 14/52014/2
Timothy Day [Sun, 20 Aug 2023 04:10:27 +0000 (04:10 +0000)]
LU-6142 lov: cleanup unneeded macros from lov_request.c

One macro defines a custom U64_MAX. The other adds
together two numbers, capping the sum at U64_MAX.
These macros are only used in a couple places. The
logic would be clearer and more concise without them.
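
For reference, an addition capped at the 64-bit maximum can be written directly in a few lines. This is a generic userspace sketch using UINT64_MAX from <stdint.h>; sat_add_u64() is a hypothetical name and is not the code that replaced the macros.

```c
#include <stdint.h>

/* Add two unsigned 64-bit values, capping the result at UINT64_MAX
 * instead of wrapping around on overflow. */
static inline uint64_t sat_add_u64(uint64_t a, uint64_t b)
{
	return (a > UINT64_MAX - b) ? UINT64_MAX : a + b;
}
```

The comparison is done before the addition, so the overflow never actually occurs.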

Also, fix an incorrect comment.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I31012fbddba459df909c27cde8c59461f013c3be
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-6142 ptlrpc: Fix style issues for layout.c 26/51926/3
Arshad Hussain [Wed, 9 Aug 2023 04:30:01 +0000 (10:00 +0530)]
LU-6142 ptlrpc: Fix style issues for layout.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/layout.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ib482495ede6264dd3d42f90dbc50606487fd0b52
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-6142 ptlrpc: Fix style issues for events.c 88/51888/4
Arshad Hussain [Tue, 8 Aug 2023 10:15:13 +0000 (15:45 +0530)]
LU-6142 ptlrpc: Fix style issues for events.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/events.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I8a49e3c9216a042ca157bde3b82a06918f3f6554
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51888
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16847 ldiskfs: refactor code. 90/51390/6
Alexey Lyashkov [Tue, 20 Jun 2023 12:23:56 +0000 (15:23 +0300)]
LU-16847 ldiskfs: refactor code.

Unused parameters should be removed to reduce stack usage.
iobuf is a common struct in the I/O path now.

Test-Parameters: trivial
HPE-bug-id: LUS-11645
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ie4d68ff7548f049de8706ac5b0e3f62eb15a211a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51390
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months agoLU-16077 ptlrpc: Fix ptlrpc_body_v2 with pb_uid/pb_gid 22/51122/3
Etienne AUJAMES [Wed, 24 May 2023 13:26:27 +0000 (15:26 +0200)]
LU-16077 ptlrpc: Fix ptlrpc_body_v2 with pb_uid/pb_gid

ptlrpc_body_v2 and ptlrpc_body_v3 should have the same fields except
for jobid.

This patch fixes the debug request messages by printing request
uid/gid at the end. That way debugging tools can still parse message
for newer versions.

Fixes: 0544c10 ("LU-16077 tbf: pb_uid/pb_gid ptlrpc_body fields for TBF rules")
Test-Parameters: testlist=sanityn env=ONLY=77,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I1faa13fa7c5b03bfeeb7cd75f7dbbfa8ca8ca941
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 months agoLU-16827 obdfilter: Fix obdfilter-survey/1a 35/51035/12
Arshad Hussain [Thu, 24 Aug 2023 05:58:14 +0000 (01:58 -0400)]
LU-16827 obdfilter: Fix obdfilter-survey/1a

local_node() in test-framework is used
to determine whether a node is remote or local.
local_node() returns "true" if the node is
local; for a remote node it returns "false".

This patch fixes the obdfilter-survey/1a test case,
which was calling local_node() with reversed
logic to determine remote/local node.

This patch modifies local_node() to return
"true"/"false" instead of 0/1.

This patch also replaces lctl with $LCTL.

Test-Parameters: testlist=obdfilter-survey
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I7bcb483975ec46d9847e0050e5a1f22f68663c80
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-11457 osd-ldiskfs: scrub FID reuse 01/51601/7
Lai Siyao [Fri, 7 Jul 2023 09:21:05 +0000 (05:21 -0400)]
LU-11457 osd-ldiskfs: scrub FID reuse

It's possible that two inodes point back to the same FID. Check the
inodes in osd_scrub_check_update() to decide which mapping
should be kept:
* if one inode doesn't exist, its mapping is stale.
* if one inode's mtime is later than the other's, keep its mapping.
* if the two inode mtimes are equal and one inode's size is not 0,
  keep its mapping; otherwise both inode sizes are 0, so just keep the
  existing mapping.
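
The rules in the list above can be sketched as a small decision function. This is a simplified illustration, not the code in osd_scrub_check_update(): struct inode_info, enum keep and choose_mapping() are hypothetical names, and the behavior when both inodes are missing, or when both have equal mtimes and non-zero sizes, is an assumption (the existing mapping is kept).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of an inode for this sketch. */
struct inode_info {
	bool	 exists;
	int64_t	 mtime;
	uint64_t size;
};

enum keep { KEEP_EXISTING, KEEP_NEW };

/* Decide which of two inodes claiming the same FID keeps the mapping:
 * "cur" holds the existing mapping, "cand" is the other claimant. */
static enum keep choose_mapping(const struct inode_info *cur,
				const struct inode_info *cand)
{
	/* a missing inode means its mapping is stale */
	if (!cur->exists)
		return cand->exists ? KEEP_NEW : KEEP_EXISTING;
	if (!cand->exists)
		return KEEP_EXISTING;

	/* the inode with the later mtime wins */
	if (cur->mtime != cand->mtime)
		return cand->mtime > cur->mtime ? KEEP_NEW : KEEP_EXISTING;

	/* equal mtimes: prefer the inode that has data */
	if (cur->size == 0 && cand->size != 0)
		return KEEP_NEW;

	/* both empty (or, by assumption, both non-empty): keep existing */
	return KEEP_EXISTING;
}
```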

Remove IDIF support in osd_scrub_check_update() to simplify
code logic.

Add sanity-scrub 4e to verify it.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ida020c2852c66f1a8910845bd16ab4c882858a4e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-16097 tests: skip quota subtests in interop 09/52009/11
Andreas Dilger [Fri, 18 Aug 2023 21:55:10 +0000 (21:55 +0000)]
LU-16097 tests: skip quota subtests in interop

Skip subtests in sanity-quota.sh to avoid interop test failures,
backdated to check all new tests since 2.14.0 for completeness.

Test-Parameters: trivial testlist=sanity-quota ossversion=2.15.3
Test-Parameters: testlist=sanity-quota mdsversion=2.15.3
Fixes: 513b1cdbca ("LU-16340 quota: notify only global lqe")
Fixes: d4978678b4 ("LU-15694 quota: keep grace time while setting default")
Fixes: 25a70a88c9 ("LU-13952 quota: default OST Pool Quotas")
Fixes: 188112fc80 ("LU-14300 quota: avoid nested lqe lookup")
Fixes: 8c19365416 ("LU-13971 quota: report Pool Quotas for a user")
Fixes: a4fbe7341b ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Fixes: 3ffa5d680f ("LU-14740 llite: avoid project quota overflow")
Fixes: 29e00cecc6 ("LU-14696 llite: check read only mount for setquota")
Fixes: 789038c97a ("LU-15167 quota: fallocate send UID/GID for quota")
Fixes: 5fc934ebbb ("LU-15519 quota: fallocate does not increase projid usage")
Fixes: c9901b68b4 ("LU-13587 quota: protect qpi in proc")
Fixes: 61ec1e0f2c ("LU-15031 quota: reseed glbe in qmt_lvbo_udate")
Fixes: dfe7d2dd2b ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Fixes: 862f0baa7c ("LU-15097 quota: stop pool_recalc before killing pool")
Fixes: 61481796ac ("LU-15193 quota: expand QUOTA_MAX_TRANSIDS to 12")
Fixes: a2fd4d3aee ("LU-15880 quota: fix insane grant quota")
Fixes: 6c0b4329d0 ("LU-16339 quota: notify OSTs until lge_qunit_nu is set")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ife8bfd83d0f217c534f3b12b4c9d108d370ed6b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months agoLU-13306 mgs: support large NID for mgs_write_log_osc_to_lov 53/52053/5
James Simmons [Wed, 23 Aug 2023 14:46:34 +0000 (10:46 -0400)]
LU-13306 mgs: support large NID for mgs_write_log_osc_to_lov

The various llogs on the MGS needed to be updated to support both
64 bit NID size and the newer large NID format. The function
mgs_write_log_osc_to_lov was missed in this update.

Test-Parameters: trivial testlist=runtests ossversion=2.15.3
Fixes: c0cb747ebe9 ("LU-13306 mgs: use large NIDS in the nid table on the MGS")
Change-Id: If543a0421d1f3cac9827581ce46da911c3456efd
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52053
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
8 months ago LU-16541 tests: Improve test 64f 40/52040/5
Patrick Farrell [Tue, 22 Aug 2023 16:32:52 +0000 (12:32 -0400)]
LU-16541 tests: Improve test 64f

The buffered IO part of test 64f has several timing-related
holes and other oddities.  The use of multiop in the
background does not guarantee the RPC will not be sent, AND
the test doesn't kill it correctly.

Clean this up and make a more reliable version of the test.
Hopefully this will resolve the failures; if not, a
better version of the test will at least allow debugging.

Test-Parameters: trivial
Test-Parameters: testlist=sanity envdefinitions=ONLY=64f,ONLY_REPEAT=20
Test-Parameters: testlist=sanity envdefinitions=ONLY=64f,ONLY_REPEAT=20
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I25b825e1d9d516635ef8cbd26dd12809625c34df
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52040
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
8 months ago LU-17005 obdclass: allow stats header to be disabled 23/51823/2
Andreas Dilger [Mon, 31 Jul 2023 19:34:22 +0000 (13:34 -0600)]
LU-17005 obdclass: allow stats header to be disabled

Add a global "enable_stats_header" tunable parameter that can be
set to enable/disable the "start_time" and "elapsed_time" fields
in the standard lprocfs "stats" files.

Default to enabled, since this landed shortly after v2_14_0.

Test-Parameters: trivial
Fixes: 5efb892396e3 ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I460b957447bfb83e6d4fd7395b79ce994f3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51823
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-16341 tests: skip sanity-quota/test_14 for old MDS 49/51949/2
Alex Deiter [Tue, 15 Aug 2023 18:47:51 +0000 (22:47 +0400)]
LU-16341 tests: skip sanity-quota/test_14 for old MDS

Skip sanity-quota test_14 on an old MDS that is missing the fix
for the LU-16341 kernel NULL pointer panic in qmt_site_recalc_cb.

Fixes: d965d63415 ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Test-Parameters: trivial testlist=sanity-quota env=ONLY=14
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I1a23daa06f0cd306c2b034df18617c2650945b28
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 months ago LU-17027 target: include linux/file.h 43/51943/2
Xinliang Liu [Tue, 15 Aug 2023 07:58:14 +0000 (07:58 +0000)]
LU-17027 target: include linux/file.h

In some 4.x kernels like 4.19 we need to include linux/file.h to
have alloc_file_pseudo() defined.

Change-Id: Ieee8d5ac5b080bd3b5c761f54a5ca2f9581ecfe1
Test-Parameters: trivial
Fixes: ac0380dc519a ("LU-137 osd-ldiskfs: pass through resize ioctl")
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51943
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>