Whamcloud - gitweb
fs/lustre-release.git
19 months ago LU-14668 lnet: add 'force' option to lnetctl peer del 49/50149/5
Serguei Smirnov [Mon, 27 Feb 2023 23:41:19 +0000 (15:41 -0800)]
LU-14668 lnet: add 'force' option to lnetctl peer del

Add --force option to 'lnetctl peer del' command.
If the peer's primary NID is locked, this option allows
the peer to be deleted manually:
lnetctl peer del --prim_nid <nid> --force

Add --prim_lock option to 'lnetctl peer add' command.
If specified, the primary NID of the peer is locked
so that it is the NID used to identify
the peer in communications with the Lustre layer.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia6001856cfbce7b0c3288cff9b244b569d259647
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50149
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months ago LU-16591 build: use vendor tag for LUSTRE_FIX 36/50136/3
Andreas Dilger [Fri, 24 Feb 2023 19:15:40 +0000 (12:15 -0700)]
LU-16591 build: use vendor tag for LUSTRE_FIX

Improve the regexp for the LUSTRE_FIX (fourth) component of the
LUSTRE_VERSION parsing so that if there is an additional component
in the version like "2.15.2-abc123" the "123" will be reported as
the fourth component of the version in obd_connect_data.ocd_version.
If there is already a fix version specified (e.g. 2.15.2.4-abc123)
then it will continue to be used for LUSTRE_FIX instead of the extra
"-abcNN" value.

The build version is shown in "{mdt,obdfilter}.*.exports.*.export"
param on servers for connected clients, and in "{mdc,osc}.*.import"
param on clients for connected servers.  Displaying the full version
improves debuggability of remote peers to know their specific build
instead of showing the first three digits and always ".0" at the end.

Since ocd_version is a numeric field it is not possible to include
the "abc" part of the version string on the peer nodes.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I96b38f99b2522c5ea3f3b3e2ddd7cd64f1ce7057
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50136
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
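
As a rough illustration of the LUSTRE_FIX rule described in LU-16591 above, the following Python sketch extracts four numeric components from a version string. The real logic is a regexp in the build system; the names parse_lustre_version and pack_version, the pattern itself, and the one-byte-per-component packing are assumptions made for this example only.

```python
import re

# Hypothetical pattern: "major.minor.patch", an optional explicit ".fix",
# and an optional vendor tag introduced by "-" (for example "-abc123").
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-(.+))?$")


def parse_lustre_version(version):
    """Return (major, minor, patch, fix) for a version string."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    if match.group(4) is not None:
        # An explicit fourth component always wins over the vendor tag.
        fix = int(match.group(4))
    else:
        # Otherwise fall back to the trailing digits of the vendor tag.
        digits = re.search(r"(\d+)$", match.group(5) or "")
        fix = int(digits.group(1)) if digits else 0
    return major, minor, patch, fix


def pack_version(major, minor, patch, fix):
    """Pack four components into one integer, one byte each (assumed layout)."""
    return (major << 24) | (minor << 16) | (patch << 8) | (fix & 0xFF)
```

Because the packed value is purely numeric, the letters of the vendor tag have nowhere to go, which matches the limitation noted in the commit message.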
19 months ago LU-10360 ldlm: remove client_import_find_conn() 00/50000/3
Mr NeilBrown [Mon, 9 May 2022 01:34:05 +0000 (11:34 +1000)]
LU-10360 ldlm: remove client_import_find_conn()

This function hasn't been used since Commit 37be05eca3f4 ("LU-10360
mgc: Use IR for client->MDS/OST connections").
So remove it.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idd68c2de1aba914e9017e9e8c10fbbe869ea5b26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
19 months ago LU-16518 ptlrpc: fix clang build errors 59/49859/3
Timothy Day [Sun, 19 Feb 2023 21:21:09 +0000 (21:21 +0000)]
LU-16518 ptlrpc: fix clang build errors

Fixed bugs which cause errors on Clang.

The majority of changes involve adding
defines for the 'ptlrpc_nrs_ctl' enum.
This avoids having to explicitly cast
enums from one type to another.

An unused variable 'req' was removed from
'nrs_tbf_req_get'. A 'strlcpy' in
'sptlrpc_process_config' was copying the
wrong number of bytes. Another variable,
'rc' in 'sptlrpc_lproc_init', seemed to
be neglected unintentionally; this was also
fixed.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: If994c625199b392198f944f9cd21bbf2142bce69
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49859
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months ago LU-14824 test: sanity 413a/b unlink timeout v2 99/49799/13
Lai Siyao [Wed, 22 Dec 2021 12:26:49 +0000 (07:26 -0500)]
LU-14824 test: sanity 413a/b unlink timeout v2

Unlinking remote/striped directories is slow on ZFS systems, so limit
the total number of directories for the 1-stripe directory test in 413a/b
on ZFS systems, and don't test striped directories, to avoid timeouts.

Also limit total stripe object count to avoid timeout.

Use fallocate to fill the MDTs if it is supported.  Add a new helper
function check_fallocate() that just determines if fallocate support
is available on the OST without trying to change the mode.  This is
cached across calls to avoid repeated SSH calls to get information
that is the same for the entire test run.

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 fstype=ldiskfs testlist=sanity env=ONLY="413a 413b",ONLY_REPEAT=50
Test-Parameters: mdscount=2 mdtcount=4 fstype=zfs testlist=sanity env=ONLY="413a 413b",ONLY_REPEAT=50
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie116e6df5aee3877ed9f093f58e7bd71f63ebbe5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49799
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months ago LU-15626 tests: Fix "error" reported by shellcheck (1/5) 35/49435/7
Arshad Hussain [Wed, 22 Jun 2022 09:57:53 +0000 (15:27 +0530)]
LU-15626 tests: Fix "error" reported by shellcheck (1/5)

This patch fixes "error" issues reported by shellcheck
for file lustre/tests/test-framework.sh. This patch also
moves spaces to tabs.

Change-Id: I2ef9e79856f3c86d71e5078c78ae309d48a9d71f
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49435
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
19 months ago LU-15944 lnet: remove crash with UDSP 01/48801/7
Cyril Bordage [Thu, 29 Sep 2022 15:19:19 +0000 (17:19 +0200)]
LU-15944 lnet: remove crash with UDSP

The following sequence of commands caused a crash:
  # lnetctl udsp add --dst tcp --prio 1
  # lnetctl discover 192.168.122.60@tcp
The pointer to lnet_peer_net in udsp_info is now checked before it is used.

Comments about syntax of "lnetctl udsp" command were updated.

Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: Ie3ae40b184e22627655e7f3813c5d16d38a6cfb8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48801
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months ago LU-16575 lnet: memory leak in copy_ioc_udsp_descr 81/50081/4
Chris Horn [Tue, 21 Feb 2023 16:37:56 +0000 (10:37 -0600)]
LU-16575 lnet: memory leak in copy_ioc_udsp_descr

copy_ioc_udsp_descr() doesn't correctly handle the case where a
net number was not specified. In this case, there isn't any net
number range that needs to be copied into the udsp descriptor.

Test-Parameters: trivial testlist=sanity-lnet env=ONLY=400
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ica900e2b0ec816237a8303d1d9e07cc1f6c5a652
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50081
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
19 months ago LU-14958 kernel: use rhashtable for revoke records in jbd2 22/45122/56
Alex Zhuravlev [Tue, 11 Oct 2022 11:59:48 +0000 (14:59 +0300)]
LU-14958 kernel: use rhashtable for revoke records in jbd2

A resizable hashtable should improve journal replay time when
the journal has millions of revoke records. Note that
rhashtable is used during replay only, as removal with list_del()
is less expensive and is used a lot during regular processing.

before:
1048576 records - 95 seconds
2097152 records - 580 seconds

after:
1048576 records - 2 seconds
2097152 records - 3 seconds
4194304 records - 7 seconds

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9a9e3801223fa9e36cbf6d2ef5ddbad5dff3e19d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
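
The replay timings in LU-14958 above grow much faster than the record count before the change and roughly linearly after it. The Python sketch below shows one plausible reason, assuming the old table used a fixed number of chained buckets; that layout, the class names, and the method names are assumptions for illustration and are not taken from the jbd2 code.

```python
class FixedBucketRevokeTable:
    """Revoke records in a fixed number of chained buckets (assumed layout)."""

    def __init__(self, nbuckets=256):
        self._buckets = [[] for _ in range(nbuckets)]

    def set_revoke(self, blocknr, sequence):
        chain = self._buckets[blocknr % len(self._buckets)]
        for record in chain:  # linear scan: cost grows with chain length
            if record[0] == blocknr:
                record[1] = max(record[1], sequence)
                return
        chain.append([blocknr, sequence])

    def lookup(self, blocknr):
        chain = self._buckets[blocknr % len(self._buckets)]
        for blk, seq in chain:
            if blk == blocknr:
                return seq
        return None


class ResizableRevokeTable:
    """Same interface on a dict, which grows as records are added."""

    def __init__(self):
        self._records = {}

    def set_revoke(self, blocknr, sequence):
        old = self._records.get(blocknr)
        if old is None or sequence > old:
            self._records[blocknr] = sequence

    def lookup(self, blocknr):
        return self._records.get(blocknr)
```

With a fixed bucket count, every insert scans a chain whose length grows with the number of records, so total replay work grows quadratically; a table that resizes keeps each insert close to constant time.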
19 months ago LU-15004 osd: remove wrong assertion on oo_dn 11/44911/30
Alex Zhuravlev [Tue, 14 Sep 2021 09:50:24 +0000 (12:50 +0300)]
LU-15004 osd: remove wrong assertion on oo_dn

The assertion is invalid since LU-14531 landed. Instead, we just
check that the object exists and return an error otherwise.
All of this is serialized against object destroy.

Fixes: 51350e9b738d ("LU-14531 osd: serialize access to object vs object destroy")

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Icff39987f4f8d5a8227c6e3b829b58979c1b1941
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44911
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-14668 lnet: don't delete peer created by Lustre 65/43565/10
Amir Shehata [Thu, 6 May 2021 06:02:22 +0000 (23:02 -0700)]
LU-14668 lnet: don't delete peer created by Lustre

Peers created by Lustre have their primary NIDs locked.
If such a peer is deleted, it will confuse Lustre. So when manually
deleting a peer using:
   lnetctl peer del --prim_nid ...
we must continue to preserve the primary NID. Therefore we delete
all the constituent NIDs, but keep the primary NID. We then
flag the peer for rediscovery.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I34eef9b0049435a01fde87dc8263dd50f631c551
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43565
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
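
The deletion rule from the LU-14668 commits above (keep a locked primary NID unless --force is given) can be modelled with a small Python sketch. The Peer class, its field names, and the peer_del function are hypothetical stand-ins for illustration, not LNet data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    primary_nid: str
    nids: list = field(default_factory=list)  # includes the primary NID
    primary_locked: bool = False
    needs_rediscovery: bool = False


def peer_del(peers, prim_nid, force=False):
    """Delete a peer from the `peers` dict, honoring a locked primary NID.

    Returns the surviving Peer when only its extra NIDs were dropped,
    or None when the whole peer was removed.
    """
    peer = peers[prim_nid]
    if peer.primary_locked and not force:
        # Keep the primary NID, drop the rest, and ask for rediscovery.
        peer.nids = [peer.primary_nid]
        peer.needs_rediscovery = True
        return peer
    del peers[prim_nid]
    return None
```

Passing force=True corresponds to the manual override added by the --force option; without it, a locked peer always survives with its primary NID intact.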
19 months ago LU-14668 lnet: Peers added via kernel API should be permanent 88/43788/7
Chris Horn [Tue, 25 May 2021 16:17:49 +0000 (11:17 -0500)]
LU-14668 lnet: Peers added via kernel API should be permanent

The LNetAddPeer() API allows Lustre to predefine the Peer for LNet.
Originally these peers would be temporary and potentially re-created
via discovery. Instead, let's make these peers permanent. This allows
Lustre to dictate the primary NID of the peer. LNet makes sure this
primary NID is not changed afterwards.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I3f54c04719c9e0374176682af08183f0c93ef737
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43788
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
19 months ago LU-14668 lnet: Lock primary NID logic 06/50106/4
Amir Shehata [Wed, 5 May 2021 18:35:06 +0000 (11:35 -0700)]
LU-14668 lnet: Lock primary NID logic

If a peer is created by Lustre make sure to lock that peer's
primary NID. This peer can be discovered in the background.
There is no need to block until discovery is complete, as Lustre
can continue on with the primary NID it provided.

Discovery will populate the peer with other interfaces the peer has
but will not change the peer's primary NID. It can also delete the
peer's NIDs that Lustre told it about (but not the primary NID).

If a peer has been manually discovered via
   lnetctl discover <nid>
command, then make sure to delete the manually discovered
peer and recreate it with the Lustre NID information
provided for us.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8fc8a69caccca047e3085bb33d026a3f09fb359b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50106
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
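
The effect of a locked primary NID on discovery, as described in "Lock primary NID logic" above, can be sketched in Python as follows. The function apply_discovery and its arguments are hypothetical, and the unlocked branch (first discovered NID becomes primary) is an assumption added only for contrast.

```python
def apply_discovery(primary_nid, primary_locked, known_nids, discovered_nids):
    """Return (primary_nid, nids) after merging a discovery result."""
    if not discovered_nids:
        # Nothing was learned, so keep what we already have.
        return primary_nid, list(known_nids)
    if primary_locked:
        # Discovery may add or drop secondary NIDs, never the primary.
        others = [nid for nid in discovered_nids if nid != primary_nid]
        return primary_nid, [primary_nid] + others
    # Assumption: without a lock, the first discovered NID becomes primary.
    return discovered_nids[0], list(discovered_nids)
```

In the locked case, a NID that Lustre supplied earlier but that the peer did not report is dropped, while the primary NID stays first in the list.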
19 months ago LU-13743 build: Explicitly require encryption support 43/39243/10
Shaun Tancheff [Thu, 2 Feb 2023 03:47:15 +0000 (21:47 -0600)]
LU-13743 build: Explicitly require encryption support

Linux commit v5.18-rc5-17-gb1241c8eb977
  ext4: move ext4 crypto code to its own file crypto.c

Update the ldiskfs Makefile to exclude crypto.c when
CONFIG_FS_ENCRYPTION is not enabled.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ic8a40f3d395286bb52ed20693fd7cc4755b10556
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39243
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
19 months ago LU-16509 lnet: quash memcpy WARN_ONCE false positive 01/49801/7
Shaun Tancheff [Fri, 27 Jan 2023 07:08:23 +0000 (01:08 -0600)]
LU-16509 lnet: quash memcpy WARN_ONCE false positive

Linux v6.1-rc1-4-g6f7630b1b5bc
  fortify: Capture __bos() results in const temp var

In lnet_peer_push_event() the memcpy triggers a WARN_ONCE
due to the flexible array at the end of
struct lnet_ping_info contained in struct lnet_ping_buffer

Use unsafe_memcpy() to avoid this false positive warning.

Test-Parameters: trivial
HPE-bug-id: LUS-11455
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4aa8f38678cd1522004d98b58a3f440d8a38589c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49801
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
19 months ago LU-16571 utils: fix parallel "lfs migrate -b" on hard links 13/50113/4
Etienne AUJAMES [Wed, 22 Feb 2023 10:37:49 +0000 (11:37 +0100)]
LU-16571 utils: fix parallel "lfs migrate -b" on hard links

Multiple blocking "lfs migrate" on the same file can exhaust "ost"
service threads of an OSS CPT.

llapi_get_data_version(...,LL_DV_RD_FLUSH) causes the OSS server to
take a server-side PR extent lock to force clients holding a write lock
to update the data version of the object.

migrate_block() (lfs.c) checks the file data version with
LL_DV_RD_FLUSH before taking the group lock.
So the "ofd_getattr_hdl()" server-side lock will conflict with the lfs
instance that holds the group lock.
Each attempt to get the server-side extent lock will occupy an "ost"
service thread slot while waiting for the group lock to be released.

If all threads of the "ost" service are exhausted on a CPT, the OSS
cannot handle requests from the client, and they get queued inside
the NRS policy. This causes the lfs process with the group lock to
hang (pread needs the "ost" service to get the sizes of objects).

This patch checks the file data version inside the group lock without
LL_DV_RD_FLUSH. This flag is not needed, since the client already has an
extent group lock on all the OST objects.

Add the regression test sanity 56xj.

Test-Parameters: testlist=sanity env=ONLY=56xj,ONLY_REPEAT=20
Test-Parameters: testlist=sanity env=ONLY=56
Test-Parameters: testlist=sanity env=ONLY=56
Test-Parameters: testlist=sanity env=ONLY=56
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I0bacd372dd6f36a4ac776133dff45dc836c7c7f7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50113
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months ago LU-16585 build: remove python2 dependencies 84/50084/2
Alex Deiter [Tue, 21 Feb 2023 22:27:47 +0000 (02:27 +0400)]
LU-16585 build: remove python2 dependencies

Fixed packaging issue caused by scripts and control files.

Test-Parameters: trivial
Signed-off-by: Alex Deiter <alex.deiter@gmail.com>
Change-Id: I6c9b24bf811269928494af17c15627902e5fe27b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50084
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16557 client: -o network needs add_conn processing 86/49986/6
Mikhail Pershin [Mon, 13 Feb 2023 09:07:45 +0000 (12:07 +0300)]
LU-16557 client: -o network needs add_conn processing

Mount option -o network restricts the client import to use
only the selected network. It processes connection UUID/NIDs
during 'setup' config command handling but skips any
'add_conn' command whose UUID does not mention that
network. Meanwhile, a connection UUID is just a name and may
have many NIDs configured, including those on the restricted
network, which are skipped as well. Therefore the client import
configuration misses failover NIDs on the restricted network.

The patch makes the import save the restricted network information
after 'setup' command processing, so it is applied to any
client_import_add_conn() call. The 'add_conn' command is
now always processed and its NIDs are filtered in the
same way as for 'setup'.
Test 31 in sanity-sec.sh is extended to check that the import's
failover_nids has all and only the NIDs on the restricted network.
Test-Parameters: env=ONLY=31 testlist=sanity-sec
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id70ebd836f061f154e3779b07b52f1baea9a1776
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49986
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
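
The behavior described in LU-16557 above, where the import remembers the restricted network from 'setup' and applies the same filter to every later 'add_conn', can be sketched in Python. The ClientImport class and its attribute names are hypothetical, and comparing the text after "@" with the network name is a simplifying assumption about how NIDs are matched.

```python
class ClientImport:
    """Toy model of an import that filters NIDs by a restricted network."""

    def __init__(self):
        self.restricted_net = None  # saved by setup(), reused by add_conn()
        self.conn_nids = []

    def setup(self, nids, network=None):
        self.restricted_net = network
        self._add_filtered(nids)

    def add_conn(self, nids):
        # Always processed; the NIDs go through the same filter as setup().
        self._add_filtered(nids)

    def _add_filtered(self, nids):
        for nid in nids:
            net = nid.rsplit("@", 1)[-1]
            if self.restricted_net is None or net == self.restricted_net:
                self.conn_nids.append(nid)
```

Because the filter looks at each NID rather than at the connection name, a failover NID on the restricted network is kept even when the connection UUID never mentions that network.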
20 months ago LU-16568 lfs: call Parser_exit() at lfs/lctl exit 39/50039/6
Arshad Hussain [Fri, 17 Feb 2023 03:38:29 +0000 (22:38 -0500)]
LU-16568 lfs: call Parser_exit() at lfs/lctl exit

Call Parser_exit() before lfs and lctl cleanly exit
to free memory allocated in Parser_init() via strdup().

Test-Parameters: trivial fstype=zfs testlist=sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I1bc86d4b17f62a545e51fb3e479b2576e6362c42
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50039
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months ago LU-16553 utils: cleanup lfs options 92/49992/2
Timothy Day [Wed, 8 Feb 2023 17:27:16 +0000 (17:27 +0000)]
LU-16553 utils: cleanup lfs options

The enums for lfs long options should start
after the char range. This is supported
by getopt_long and used by lustre/test/statx.c.
I didn't see this issue in any of the other
uses of getopt_long, so this is the only place
this fix needs to be made.

Also, enable stats when stats_interval is used.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I9144b2d68291811edf95b9d912fbbb8fe0266392
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49992
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16518 misc: use fixed hash code 16/49916/5
Timothy Day [Mon, 6 Feb 2023 20:02:15 +0000 (20:02 +0000)]
LU-16518 misc: use fixed hash code

There is a configure check to avoid using
broken hashing code. All calls to 'hash_*'
are replaced by the 'cfs_hash_*' equivalents,
to make use of this check.

Two functions which hash then apply a mask
are removed. The calls are replaced with
'cfs_hash_*' and manually applying the mask.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia4b27cb6fb1329b9df45c00f748a8d22178b0654
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49916
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months ago LU-16481 build: add server support for openEuler 52/49652/5
Xinliang Liu [Mon, 21 Nov 2022 03:36:38 +0000 (03:36 +0000)]
LU-16481 build: add server support for openEuler

openEuler uses dnf as its rpm package manager; it is somewhat like RHEL/Fedora.
The current openEuler LTS 22.03 kernel is based on Linux 5.10.0.

The ldiskfs patches are based on ldiskfs-5.10.0-ml.series; the patches
that differ from ldiskfs-5.10.0-ml.series are:
oe2203/ext4-misc.patch
oe2203/ext4-pdirop.patch
  used because the openEuler kernel backports new bugfixes;
  based on ldiskfs-5.14.21-sles15sp4.series
linux-5.16/ext4-inode-version.patch
ubuntu20.04.3/ext4-simple-blockalloc.patch
linux-5.14/ext4-xattr-disable-credits-check.patch
  used because the openEuler kernel backports new bugfixes.

This patch also fixes lbuild so that a kernel config file is not needed
for a patchless-server build, and adds a check that a patched-server
build has a patch series.

Test notes
----------
This patch was tested with the lbuild command below:
../lustre-release/contrib/lbuild/lbuild --ccache
  --kerneldir=/home/openeuler/kernel-src-rpm/
  --kernelrpm=/home/openeuler/kernel-src-rpm/
  --lustre=/home/openeuler/lustre-release/lustre-2.15.54_1_xxx.tar.gz
  --patchless-server --disable-zfs
Note that because the zfs openEuler build support patches[1] haven't been
backported to the stable release branch zfs-2.1-release and tag 2.1.5,
the current lbuild doesn't support building zfs rpms for openEuler; you
need to build the zfs rpms separately in the zfs source tree with 'make
rpms'.
Until the openEuler gcc issue[2] is fixed, you need to apply the
Lustre rpm spec patch[3].
Until the openEuler kernel symbols providing issue[4] is fixed, you
need to install the kmod rpms with 'sudo rpm -ivh --nodeps
./*.aarch64.rpm'.
[1] https://github.com/openzfs/zfs/pulls?q=is%3Apr+is%3Aclosed+openeuler
[2] https://gitee.com/openeuler/gcc/issues/I5XMD0
[3] diff lustre.spec.in
...
-%define optflags -g -O2 -Werror
+%define optflags -g -O2 -Werror -Wno-stringop-overflow
[4] https://gitee.com/src-openeuler/kernel/issues/I6DQDX

Test-Parameters: trivial
Change-Id: Ie00e7d37ba3965e409b924109085a675bf3f7f4f
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49652
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-15971 llite: match lock in corresponding namespace 43/47843/10
Lai Siyao [Wed, 29 Jun 2022 15:51:47 +0000 (11:51 -0400)]
LU-15971 llite: match lock in corresponding namespace

For remote object, LOOKUP lock is on parent MDT, so lmv_lock_match()
iterates all MDT namespaces to match locks. This is needed in places
where only LOOKUP ibit is matched, and the lock namespace is unknown.

Test-Parameters: mdscount=2 mdtcount=4 testlist=sanityn env=ONLY=109,ONLY_REPEAT=50
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I32767f63f92a4df825ddacbf435bda8cfcfbbdd7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47843
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-12275 sec: remove bio functions in fscrypt compat 23/50023/4
Andreas Dilger [Thu, 16 Feb 2023 16:21:50 +0000 (09:21 -0700)]
LU-12275 sec: remove bio functions in fscrypt compat

Remove libcfs/libcfs/crypto/bio.c since direct block device access
is not needed for client builds, and the use of struct bio on the
client adds unnecessary complexity to portability.

Test-Parameters: trivial
Fixes: a813e8187 ("LU-12275 sec: add llcrypt as file encryption library")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I97642dfd85053b9ea4196374f2002ffb6a2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50023
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16558 mdt: Fix max limit for "max_mod_rpcs_in_flight" 10/50010/8
Vitaliy Kuznetsov [Sat, 18 Feb 2023 08:28:37 +0000 (11:28 +0300)]
LU-16558 mdt: Fix max limit for "max_mod_rpcs_in_flight"

This minor patch fixes a bug in the definition of the maximum
limit for the mdt_max_mod_rpcs_in_flight parameter when it is
changed.

Fixes: f16c31ccd9 ("LU-16454 mdt: Add a per-MDT "max_mod_rpcs_in_flight")
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I5576da1dbcaa0b4202af4b02023a46991f443a4b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50010
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-15728 mdd: fix sanity-56oc failure 09/50009/5
Aurelien Degremont [Wed, 15 Feb 2023 10:42:25 +0000 (10:42 +0000)]
LU-15728 mdd: fix sanity-56oc failure

sanity 56oc started failing after the relatime patch landed.
The 'relatime' patch introduced an atime behavior change.  It forced
an atime update to disk on MDD if the ondisk atime is older than the
ondisk mtime/ctime, to match relatime (even if relatime was not enabled).

This was an optimization, trying to have a slightly better atime value
cheaply. Unfortunately, it causes a regression in sanity-56oc.
Let's remove it for now until we understand that better.

Fixes: c10c6ee ("LU-15728 llite: fix relatime support")
Test-Parameters: testlist=sanity env=ONLY=56oc,ONLY_REPEAT=70
Change-Id: Ieed4d4c7523c26cfc5bc230986d96b2acf152dee
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16555 obdclass: print more special chars for jobid 98/49998/4
Lei Feng [Wed, 15 Feb 2023 03:56:36 +0000 (11:56 +0800)]
LU-16555 obdclass: print more special chars for jobid

Print more YAML compatible special chars for jobid.
Currently they are any of ".@-_:/".

Test-Parameters: trivial
Fixes: 338381574b ("LU-11407 tgt: cleanup job_stats output printing")
Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: Ic3272b73c95b76ad3171cc7a368a18f804b9aa3e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49998
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
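
The jobid rule in LU-16555 above (alphanumerics plus any of ".@-_:/" are printed as-is) can be illustrated with a short Python sketch. The function name format_jobid is hypothetical, and the hex escaping applied to every other character is an assumption for illustration; the commit message does not say how disallowed characters are rendered.

```python
import string

# Special characters listed in the commit message.
JOBID_SPECIAL = ".@-_:/"
ALLOWED = set(string.ascii_letters + string.digits + JOBID_SPECIAL)


def format_jobid(jobid):
    """Return jobid with disallowed characters hex-escaped (assumed policy)."""
    return "".join(
        ch if ch in ALLOWED else "\\x%02x" % ord(ch) for ch in jobid
    )
```

A jobid such as "user@host:/job-1_a.b" passes through unchanged, while a space would be rendered as an escape sequence under this assumed policy.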
20 months ago LU-16551 tests: Ensure all peer credits used in MR 79/49979/2
Chris Horn [Mon, 13 Feb 2023 21:15:31 +0000 (15:15 -0600)]
LU-16551 tests: Ensure all peer credits used in MR

sanity-lnet test_254 needs to ensure that all peer credits are
consumed. Because of the raciness of the round robin code in LNet,
we cannot rely on just issuing the appropriate number of pings.
Instead we should use the --source argument to lnetctl ping to ensure
that we send the correct number of pings over each interface.

To simplify matters, only perform this test, and the other tests that
call setup_health_test(), in non-routed configurations.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 52db11cdce ("LU-16303 lnet: Drop LNet message if deadline exceeded")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I05a7ffec37d16c14711fe696232708f927357b1c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49979
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16544 kernel: kernel update RHEL 7.9 [3.10.0-1160.83.1.el7] 69/49969/2
Jian Yu [Sat, 11 Feb 2023 00:09:13 +0000 (16:09 -0800)]
LU-16544 kernel: kernel update RHEL 7.9 [3.10.0-1160.83.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.83.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I8f2e67c27e80e7ac852fc367f0bd83f2c91016ce
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49969
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16486 kernel: kernel update RHEL8.7 [4.18.0-425.10.1.el8_7] 83/49683/3
Jian Yu [Fri, 10 Feb 2023 23:48:01 +0000 (15:48 -0800)]
LU-16486 kernel: kernel update RHEL8.7 [4.18.0-425.10.1.el8_7]

Update RHEL8.7 kernel to 4.18.0-425.10.1.el8_7.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Change-Id: I5759d0cb06a1148689ed9b8c947cb6516ab3aca1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49683
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-16382 spec: improve Summary and description. 67/49367/6
Mr NeilBrown [Mon, 12 Dec 2022 04:47:31 +0000 (15:47 +1100)]
LU-16382 spec: improve Summary and description.

Summary should not repeat the name of the package, and should avoid
being overly long.  It should start with CAPS and not end with period.

Description should be more detailed than the summary.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ifcdb881f649200e1857c707fb2c589579e4035a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49367
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
20 months agoLU-16463 llite: replace lld_nfs_dentry flag with opencache handling 37/49237/11
James Simmons [Tue, 7 Feb 2023 15:08:13 +0000 (10:08 -0500)]
LU-16463 llite: replace lld_nfs_dentry flag with opencache handling

The lld_nfs_dentry flag was created for the case of caching the
open lock (opencache) when fetching fhandles for NFSv3. This same
path is used by the fhandle APIs. This lighter open changes key
behaviors since the open lock is always cached which we don't
want. Lustre introduced a way to modify caching the open lock
based on the number of opens done on a file within a certain
span of time. We can replace lld_nfs_dentry flag with the
new open lock caching. This way for fhandle handling we match
the open lock caching behavior of a normal file open.

In the case of NFS this code path will always be called from the
internal kernel thread 'nfsd'. If we are called by this kernel
thread, set the open threshold to zero, which means always cache the
open lock. Once Lustre is only supported on Linux kernels above
5.5 we can remove this special NFSv3 workaround.

Change-Id: Iba27f7ad4579fdd1f34e1e35c2cbd547e15f129a
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49237
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16523 lprocfs: adjust the format of rename_stats 69/49869/3
Lei Feng [Thu, 2 Feb 2023 01:39:03 +0000 (09:39 +0800)]
LU-16523 lprocfs: adjust the format of rename_stats

Adjust the format of rename_stats to a more human-friendly YAML.

Fixes: bedb797c5d ("LU-16110 lprocfs: make job_stats and rename_stats valid YAML")
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I20e6d07c974e907bb2e30412dd1899f845de2021
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49869
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-15420 sec: handle simple fscrypt changes for 5.15 kernels 25/49125/22
James Simmons [Wed, 8 Feb 2023 13:04:45 +0000 (08:04 -0500)]
LU-15420 sec: handle simple fscrypt changes for 5.15 kernels

The patch covers the low-impact changes to the Linux kernel's
fscrypt API. The changes are:

For Linux kernel version 5.9-rc4 the relevant commits are:

8b10fe68985278de4926daa56ad6af701839e40a removed the inode
parameter for the fscrypt function fscrypt_fname_alloc_buffer()

5b2a828b98ec1872799b1b4d82113c76a12d594f ended up exporting
fscrypt_d_revalidate() and stopped stomping on the d_ops.

c8c868abc91ff23f6f5c4444c419de7c277d77e1 changed
fscrypt_set_test_dummy_encryption() to take a 'const char *'.

ac4acb1f4b2b6b7e8d913537cccec8789903e164 moved the fscrypt core
from using fscrypt_context to using fscrypt_policy that is user
forward facing.

Lastly, for Linux kernel version 5.10-rc4 the relevant commits are:

ec0caa974cd092549ab282deb8ec7ea73b36eba0 stopped exporting
fscrypt_get_encryption_info(). Use fscrypt_prepare_readdir()
in its place.

70fb2612aab62d47e03f82eaa7384a8d30ca175d renamed a field in
struct fscrypt_name. Since Lustre can't use fscrypt_prepare_lookup
we have to deal with this change.

Remove sptlrpc_enc_pool_del_user() since it's an empty function
and calling it wastes cycles.

The other large change was the replacement of
fscrypt_inherit_context which is described in Linux commit
a992b20cd4ee360dbbe6f69339cb07146e4304d6. This change is very
large since it expects the target inode to be available. Lustre
uses fscrypt_inherit_context before the inode is available so
this is a much more complex change that will be done in another
patch.

de3cdc6e75179a2324c23400b21483a1372c95e1 makes fscrypt_require_key
private. You need to test the key's presence with
fscrypt_has_encryption_key() instead but that key needs to be
setup first by the new function fscrypt_prepare_new_inode().
Since this is the case we wait to introduce this change.

Change-Id: I4bed7fef6e3302c0258c0f1563f4e180258d7a5a
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49125
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16101 tests: add sanity/27J to always_except 70/49970/3
Jian Yu [Sun, 12 Feb 2023 07:48:30 +0000 (23:48 -0800)]
LU-16101 tests: add sanity/27J to always_except

This patch adds sanity/27J to always_except for SLES15 SP4
and 5.16.0+ kernels before the issue introduced by upstream
commit 8c8387ee3f55
("mm: stop filemap_read() from grabbing a superfluous page")
is resolved.

Test-Parameters: trivial clientdistro=sles15sp4 testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Iafde656530fcdc1de9265aacaa9266435c9d5c47
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49970
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Xing Huang <hxing@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14719 lod: ignore space check error in recovery 49/49249/6
Lai Siyao [Thu, 24 Nov 2022 21:51:58 +0000 (16:51 -0500)]
LU-14719 lod: ignore space check error in recovery

statfs may fail in recovery, ignore this error in
lod_trans_space_check().

Fix syntax error in replay-single 111g version check.

Fixes: 6aee406c84 ("LU-14719 lod: distributed transaction check space")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I6c7934ca242a639d996d0ab5a4d7648cec8a53de
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49249
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-15280 llog: fix processing of a wrapped catalog 08/45708/24
Etienne AUJAMES [Wed, 1 Dec 2021 23:22:33 +0000 (00:22 +0100)]
LU-15280 llog: fix processing of a wrapped catalog

Several issues were found with "lfs changelog --follow" for a wrapped
catalog (llog_cat_process() with startidx):

1/ incorrect lpcd_first_idx value for a wrapped catalog (startcat>0)

The first llog index to process is "lpcd_first_idx + 1". The startidx
represents the last record index processed for a llog plain. The
catalog index of this llog is startcat.
lpcd_first_idx of a catalog should be set to "startcat - 1"
e.g:
llog_cat_process(... startcat=10, startidx=101) means that the
processing will start with the llog plain at the index 10 of the
catalog. And the first record to process will be at index 102.

2/ startidx is not reset for an incorrect startcat index

startidx is relevant only for a startcat. So if the corresponding llog
plain is removed or if startcat is out of range, we need to reset
startidx.

This patch removes LLOG_CAT_FIRST, which was really confusing
(LU-14158), and updates osp_sync_thread() for the corrected
llog_cat_process() behavior.

It also modifies llog_cat_retain_cb() to zap an empty plain llog
directly in it (as llog_cat_size_cb() does), since the current
implementation is not compatible with this patch.

The test "conf-sanity 135" verifies "lfs changelog --follow" for a
wrapped changelog_catalog.
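
The index convention from points 1/ and 2/ can be modeled with a small
sketch. This is an illustration in Python, not the kernel code:
resume_point() and its live_plain_llogs argument are hypothetical
names, and the sketch does not model which plain llog is picked next
when the one at startcat is gone.

```python
def resume_point(startcat, startidx, live_plain_llogs):
    """Return (lpcd_first_idx, first record index) for a resumed scan.

    startcat: catalog index of the plain llog that was being processed.
    startidx: index of the last record already processed in that llog.
    live_plain_llogs: set of catalog indices that still hold a plain
                      llog (hypothetical stand-in for the catalog).
    """
    if startcat not in live_plain_llogs:
        # startidx only makes sense for the plain llog at startcat,
        # so it is reset when that llog is removed or out of range.
        startidx = 0
    lpcd_first_idx = startcat - 1   # processing starts at lpcd_first_idx + 1
    first_record = startidx + 1     # startidx itself was already processed
    return lpcd_first_idx, first_record


# The example from the commit message: startcat=10, startidx=101.
print(resume_point(10, 101, {9, 10, 11}))   # (9, 102)
```

With the plain llog at index 10 missing, the same call yields a first
record index of 1, because the stale startidx is discarded.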

Test-Parameters: testlist=conf-sanity env=ONLY=135,ONLY_REPEAT=10
Test-Parameters: testlist=sanity env=ONLY=60a,ONLY_REPEAT=20
Test-Parameters: testlist=conf-sanity env=SLOW=yes,ONLY=106,ONLY_REPEAT=10
Fixes: a4f049b9 ("LU-13102 llog: fix processing of a wrapped catalog")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Iaf46ddd4a6ec1e06cec0d17aa9bde766bd793abc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45708
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
20 months agoLU-11912 fid: clean up OBIF_MAX_OID and IDIF_MAX_OID 59/45659/17
Li Dongyang [Tue, 23 Nov 2021 23:45:48 +0000 (10:45 +1100)]
LU-11912 fid: clean up OBIF_MAX_OID and IDIF_MAX_OID

Define the OBIF|IDIF_MAX_OID macros as (1ULL << OBIF|IDIF_MAX_BITS) - 1.
Clean up the callers and remove OBIF|IDIF_OID_MASK which are not used.
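
A mask of the low N bits is (1 << N) - 1. Without the parentheses,
subtraction binds tighter than the shift, in C and in Python alike. A
minimal check, assuming widths of 32 bits for OBIF and 48 bits for
IDIF (these widths are not stated in this commit message and are used
for illustration only):

```python
# Assumed widths, for illustration only (not taken from the commit message).
OBIF_MAX_BITS = 32
IDIF_MAX_BITS = 48


def max_oid(bits):
    """Largest value that fits in `bits` bits: all low bits set."""
    return (1 << bits) - 1


print(hex(max_oid(OBIF_MAX_BITS)))   # 0xffffffff
print(hex(max_oid(IDIF_MAX_BITS)))   # 0xffffffffffff
# Unparenthesized form: parsed as 1 << (32 - 1), a single bit.
print(hex(1 << OBIF_MAX_BITS - 1))   # 0x80000000
```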

Test-Parameters: trivial
Change-Id: I9a679b930c73da5904b2eb4c74f785fc1d27a8a0
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45659
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16536 osp: don't cleanup ldlm in precleanup phase 25/49925/6
Alex Zhuravlev [Tue, 7 Feb 2023 09:29:24 +0000 (12:29 +0300)]
LU-16536 osp: don't cleanup ldlm in precleanup phase

Instead, do this in the cleanup phase so that all OSPs have a chance
to abort in-flight RPCs, which can block an MDT thread holding
LDLM locks.

Fixes: 226fd401f9 ("LU-7660 dne: support fs default stripe")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib3714b29c514a7fa938d47717dc36525654407d6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49925
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16493 tests: recovery-small/144b to wait longer 93/49693/3
Alex Zhuravlev [Thu, 19 Jan 2023 14:14:21 +0000 (17:14 +0300)]
LU-16493 tests: recovery-small/144b to wait longer

Wait longer when ZFS is used, as ZFS primitives like osd_write()
and osd_declare_write() are still a few times slower than with
ldiskfs, and test 144b creates wide-striped files, which causes
lots of tiny writes to the last-used file.

Test-Parameters: trivial testlist=recovery-small
Test-Parameters: testlist=recovery-small fstype=zfs env=ONLY=144b,ONLY_REPEAT=100
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib023c7c71a3bb486f9f9908e4cc03cc6e53ace7a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49693
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-16515 clio: Remove cl_page_size() 18/49918/2
Patrick Farrell [Mon, 6 Feb 2023 23:22:24 +0000 (18:22 -0500)]
LU-16515 clio: Remove cl_page_size()

cl_page_size() is just a function which does:
1 << PAGE_SHIFT

and the kernel provides a macro for that - PAGE_SIZE.
Maybe it didn't when this function was added, but it sure
does now.

So, remove cl_page_size().

Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I6c27f6db7cfec5d9054aab95beccffe3c2da02bb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49918
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16532 sec: session key bad keyring 09/49909/2
Sebastien Buisson [Thu, 12 Jan 2023 09:18:48 +0000 (10:18 +0100)]
LU-16532 sec: session key bad keyring

At initialization, the session key created for GSS context is linked
to a keyring from userspace, and then unlinked from kernelspace if it
is for root, as we want to share it across all root sessions.
Sometimes initialization fails (expired token, unresponsive server,
etc.) so the key cannot be unlinked. Survive this use case gracefully.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8ad2afa0e51e50640620e36211e5db1253d85e08
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49909
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16520 build: Move strscpy to libcfs common header 63/49863/3
Shaun Tancheff [Fri, 3 Feb 2023 08:17:16 +0000 (02:17 -0600)]
LU-16520 build: Move strscpy to libcfs common header

Ensure strscpy is available to lustre

Test-Parameters: trivial
Fixes: 0b406c91d17 ("LU-13642 lnet: modify lnet_inetdev to work with large NIDS")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I0c3673c2aa7e6b61671521a8cabde8a364f7f6f8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16502 lutf: cleanup lutf_start.py, fix bugs 67/49767/3
Timothy Day [Sat, 21 Jan 2023 01:45:05 +0000 (01:45 +0000)]
LU-16502 lutf: cleanup lutf_start.py, fix bugs

Remove code duplication by adding __check_env_var
method. Use os.environ.get to remove needlessly
verbose try-except blocks. Use __cfg_yaml more,
rather than passing this value explicitly
between methods.

Add check for environment variables, so LUTF
can fail gracefully if the environment is not
set correctly.
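
The pattern described above, a single helper built on os.environ.get
that fails early with a clear message, might look roughly like this.
It is a sketch, not the code from lutf_start.py: the helper is written
as a plain function, and the variable name in the usage line is
hypothetical.

```python
import os


def check_env_var(name, default=None):
    """Return the value of environment variable `name`.

    Falls back to `default` when one is given; otherwise raises an
    error naming the missing variable so the caller can stop early.
    """
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"required environment variable {name} is not set")
    return value


# Hypothetical variable name, used only for this example.
os.environ["EXAMPLE_LUTF_DIR"] = "/tmp/lutf"
print(check_env_var("EXAMPLE_LUTF_DIR"))   # /tmp/lutf
```

One lookup helper replaces a try-except block per variable, and the
error message names the variable that needs to be set.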

Update .gitignore to ignore __pycache__

Fix syntax error in python/infra/lutf.py

Test-Parameters: @lnet
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I16cf114ac4253d22a42399e2f2cb2fad49dd96cb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49767
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16502 lutf: fix bugs in bash scripts 28/49728/4
Timothy Day [Sat, 21 Jan 2023 01:40:13 +0000 (01:40 +0000)]
LU-16502 lutf: fix bugs in bash scripts

Addressed some issues I found when running
LUTF. The "rm" fails without a file to remove.
A Lustre wiki led me to source the script, but this
will log out of the current shell. Adding a warning
against doing this.

Also, fix shellcheck errors and a few warnings in
LUTF related scripts. Added bash shebangs, since
shellcheck requires these to lint the scripts.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I501d58d25bfcd6564755485b9a1afa2277848b96
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49728
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
20 months agoLU-16494 fileset: check fileset for operations by fid 96/49696/8
Sebastien Buisson [Thu, 19 Jan 2023 17:07:27 +0000 (18:07 +0100)]
LU-16494 fileset: check fileset for operations by fid

Some operations by FID, such as lfs rmfid, must be aware of
subdirectory mount (fileset) so that they do not operate on files
that are outside of the namespace currently mounted by the client.

For lfs rmfid, we first proceed to a fid2path resolution. As fid2path
is already fileset aware, it fails if a file or a link to a file is
outside of the subdirectory mount. So we carry on with rmfid only
for FIDs for which the file and all links do appear under the
current fileset.
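
The filtering rule can be sketched as follows. This is a model in
Python rather than the actual implementation: the real code relies on
fid2path failing for paths outside the fileset, whereas this sketch
takes a hypothetical table mapping each FID to all of its link paths.

```python
import posixpath


def under_fileset(path, fileset):
    """True if `path` is `fileset` itself or lies below it."""
    path = posixpath.normpath(path)
    fileset = posixpath.normpath(fileset)
    return path == fileset or path.startswith(fileset.rstrip("/") + "/")


def fids_safe_to_remove(fid_to_paths, fileset):
    """Keep only FIDs whose file and every link are inside the fileset."""
    return [fid for fid, paths in fid_to_paths.items()
            if paths and all(under_fileset(p, fileset) for p in paths)]


# Hypothetical FIDs and paths, for illustration only.
links = {
    "fidA": ["/proj/a", "/proj/sub/a_link"],   # fully inside /proj
    "fidB": ["/proj/b", "/other/b_link"],      # one link outside
    "fidC": ["/other/c"],                      # entirely outside
}
print(fids_safe_to_remove(links, "/proj"))     # ['fidA']
```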

This new behavior is enabled as soon as we detect a subdirectory mount
is done (either directly or imposed by a nodemap fileset). This means
the new behavior does not impact normal, whole-namespace client mount.

sanity test_421h is added to exercise this new capability.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I47136ac0a3324b9afdd01b0f902abc37938bd361
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49696
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16479 utils: Add option to manage degraded ZFS OST 60/49660/6
Akash B [Tue, 17 Jan 2023 15:51:19 +0000 (10:51 -0500)]
LU-16479 utils: Add option to manage degraded ZFS OST

Add new Lustre specific ZFS dataset user property to
control/manage degraded ZFS OSTs, also modify the existing
lustre/scripts/statechange-lustre.sh zedlet accordingly.
Extend the same to mkfs.lustre utility to add this property
by default when creating a new Lustre ZFS server.

HPE-bug-id: LUS-11447
Test-Parameters: trivial fstype=zfs testlist=sanity
Signed-off-by: Akash B <akash-b@hpe.com>
Change-Id: I7032538f507c9ad20d5b109b54e3c3bab8138458
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49660
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Siddarth Raj <siddarth.raj@hpe.com>
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
20 months agoLU-16428 tests: cache is_project_quota_supported result 99/49499/3
Andreas Dilger [Fri, 23 Dec 2022 00:28:14 +0000 (17:28 -0700)]
LU-16428 tests: cache is_project_quota_supported result

Rather than is_project_quota_supported() repeatedly checking if the
MDS supports project quota, check the MDS once after mount and then
cache it for the rest of the test run.

In resetquota() there is no need to check is_project_quota_supported()
each time, since this is only called with "-p" when project quota was
previously checked and is enabled.  Also, there is no need to wait 1s
after every quota limit is reset, but rather only once at the end.
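
The caching idea can be sketched in a few lines. The test framework
itself is shell, so this Python version only models the approach; the
QuotaSupport class and its probe callable are hypothetical.

```python
class QuotaSupport:
    """Ask the MDS once, then answer from the cached result."""

    def __init__(self, probe):
        self._probe = probe     # callable that queries the MDS (expensive)
        self._cached = None     # None means "not checked yet"

    def is_project_quota_supported(self):
        if self._cached is None:
            self._cached = bool(self._probe())
        return self._cached


calls = []

def fake_probe():
    # Stand-in for the real MDS query; records every invocation.
    calls.append(1)
    return True

quota = QuotaSupport(fake_probe)
for _ in range(5):
    quota.is_project_quota_supported()
print(len(calls))   # 1
```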

Test-Parameters: trivial testlist=sanity-quota mdtcount=4 mdscount=2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id98240e2fcd2e862bb4305961a2946227e3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49499
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-16382 spec: use pkgconfig() as appropriate. 66/49366/4
Mr NeilBrown [Mon, 12 Dec 2022 04:42:11 +0000 (15:42 +1100)]
LU-16382 spec: use pkgconfig() as appropriate.

pkgconfig() is preferred over explicit dependencies on libfoo-devel.
For SUSE, this is particularly needed for systemd: with SLE15-SP4
the OBS automatically includes "systemd-mini", but that conflicts with
"systemd".  Using "pkgconfig(systemd)" resolves this.

Also there is no need to have "Depends" for library packages.  These
are determined automatically from the result of the build.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idfa4fdfd8bf060175b64a1991d3367024a368344
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49366
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
20 months agoLU-16382 spec: Don't include Group: tags. 64/49364/4
Mr NeilBrown [Mon, 12 Dec 2022 04:25:31 +0000 (15:25 +1100)]
LU-16382 spec: Don't include Group: tags.

Fedora Project has deprecated group tags:
  https://fedoraproject.org/wiki/RPMGroups
  https://docs.fedoraproject.org/en-US/packaging-guidelines/#_tags_and_sections

The group tags currently used are not recognised by SUSE.

So remove all the Group: tags - except one.

The %kernel_module_package macro for SUSE requires that a group
be given.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I6b7222200ea1a02319a703d64542dfb9780c048a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49364
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14224 misc: add firewalld service configuration 21/41021/3
Andreas Dilger [Wed, 7 Apr 2021 19:37:42 +0000 (12:37 -0700)]
LU-14224 misc: add firewalld service configuration

RHEL8 ships with restrictive firewalld rules out of the box.
This prevents servers and clients from connecting to each other.
Add a lustre.xml service file for firewalld, so that it is easy
to run a command like:

    firewall-cmd --permanent --zone=public --add-service=lustre

to add the Lustre service ports with minimal difficulty.

It would be good if this was run automatically when the RPMs are
installed, or when mount.lustre is run, but it isn't clear what
is good/safe/correct in all cases. At least having the service
file will be a starting point to make this easier for admins.

It would be even better if the Lustre service rules were restricted
to accepting only new connections, and clients would only accept
requests from the MGS initially and then dynamically add ports for
servers as they are configured, but this is beyond my firewalld-fu.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9f49d4b0df1c9fb6b343df81f966d9110c300c1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41021
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14111 obdclass: count eviction per obd_device 28/40528/15
Aurelien Degremont [Tue, 13 Oct 2020 14:12:23 +0000 (14:12 +0000)]
LU-14111 obdclass: count eviction per obd_device

Add a new 'obd_eviction_count' counter to obd_device which
is increased every time a client is evicted, which means
every time we call `class_fail_export()`.

Expose this counter through `lctl get_param *.*.eviction_count`
for every target.

Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: I83b691662285cf2cd937187bffa54de6bd1f694c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
20 months agoLU-16501 tgt: skip free inodes in OST weights 90/49890/4
Andreas Dilger [Fri, 3 Feb 2023 10:14:39 +0000 (03:14 -0700)]
LU-16501 tgt: skip free inodes in OST weights

In lu_tgt_qos_weight_calc() calculate the target weight consistently
with how the per-OST and per-OSS penalty calculation is done in
ltd_qos_penalties_calc().  Otherwise, the QOS weighting calculations
combine two different units, which incorrectly weighs allocations on
OST with more free inodes over those with more free space.

Fixes: d3090bb2b486 ("LU-11213 lod: share object alloc QoS code with LMV")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1ccc52d7ad5dc440ae48403ba129efd6a0a51c33
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49890
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 lov: use list_for_each_entry in lov_obd.c 40/49740/3
Mr. NeilBrown [Mon, 23 Jan 2023 21:58:40 +0000 (16:58 -0500)]
LU-6142 lov: use list_for_each_entry in lov_obd.c

Using the *_entry macro simplifies the code slightly.

Change-Id: Ia50cd2cbaf9bac6c9873ef4af7e5cbfa2c8e660e
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49740
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14918 osd: don't declare similar zfs writes twice 01/49701/7
Alex Zhuravlev [Thu, 19 Jan 2023 19:06:36 +0000 (22:06 +0300)]
LU-14918 osd: don't declare similar zfs writes twice

In some cases (like overstriping) the same operations can be
declared multiple times (new llog records), which leads to a
huge number of credits and performance degradation. We can
avoid this by checking for duplicate declarations.
Note that each declare operation results in an allocation in ZFS.

Example for an overstriped file (2000 stripes over 4 OSTs):

  declare ops   before   after
  create:         2001       2
  unlink:        10001      10

creation of 1K-stripe files (over 4 OSTs) is 2.5% faster.
removal of 1K-stripe files is 44% faster.

single-stripe file creation/removal does not degrade.
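
The duplicate check can be pictured with a toy model. This is a sketch
in Python, not the osd-zfs code: the Transaction class and the
(object, operation) key are assumptions, and the model does not
reproduce the measured counts above.

```python
class Transaction:
    """Toy transaction that collects write declarations."""

    def __init__(self):
        self.declared = set()   # (object id, operation) pairs already seen
        self.declare_ops = 0    # declarations that actually cost something

    def declare_write(self, obj, op="write"):
        key = (obj, op)
        if key in self.declared:
            return False        # duplicate: nothing new to reserve
        self.declared.add(key)
        self.declare_ops += 1
        return True


tx = Transaction()
for _ in range(2000):
    # Each stripe adds a record to the same (hypothetical) llog object.
    tx.declare_write("llog-object")
print(tx.declare_ops)   # 1
```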

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5d9e6d3a1574ccd7bf97fd3a67ab4fff0b6a352c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49701
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-14918 osd: don't declare similar ldiskfs writes twice 65/45765/38
Alex Zhuravlev [Tue, 7 Dec 2021 08:13:54 +0000 (11:13 +0300)]
LU-14918 osd: don't declare similar ldiskfs writes twice

In some cases (like overstriping) the same operations can be
declared multiple times (new llog records), which leads to a
huge number of credits and performance degradation. We can
avoid this by checking for duplicate declarations.
As every declaration would need an allocation, limit the scope
of these checks to transactions likely to be large.

% of "large" transaction in sanity-benchmark, depending on threshold:

  creates < 5 && writes < 5:
  0.58% (mds1) and  2.97% (mds2)

  creates < 7 && writes < 7:
  0.58% and 2.4%

  creates < 9 && writes < 9:
  0.6% and 1.85%

  creates < 10 && writes < 10:
  0.0004% and 0.000001%

thus 10 creates or writes is selected as a threshold to enable this
logic.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7c893fe3b95646b4b813b999bc832659dfcf03ad
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45765
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-16454 mdt: Add a per-MDT "max_mod_rpcs_in_flight" 49/49749/16
Vitaliy Kuznetsov [Wed, 8 Feb 2023 21:34:38 +0000 (00:34 +0300)]
LU-16454 mdt: Add a per-MDT "max_mod_rpcs_in_flight"

The max_mod_rpcs_per_client value doesn't define a static number of
slots for per-client replies; it is only used to pass the limit to
the client. For the same reason, there doesn't appear to be a hard
limitation preventing the client from changing and exceeding the
server-provided parameter, other than to avoid overloading the server
with too many RPCs at once. That may also be true of the current
limit with a larger number of clients, no different than
"max_rpcs_in_flight".

This fix adds a tunable parameter "max_mod_rpcs_in_flight" per MDT
to lustre/mdt/mdt_lproc.c so that it can be set
with "lctl set_param" at runtime. The max_mod_rpcs_per_client global
setting is marked "deprecated" but is still used as the default
value when creating an MDT.

Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I27cfcb68e1a534e80e6a2dbf2e1affc430803b49
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49749
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
20 months agoNew tag 2.15.54 2.15.54 v2_15_54
Oleg Drokin [Thu, 9 Feb 2023 17:38:57 +0000 (12:38 -0500)]
New tag 2.15.54

Change-Id: I592cabccefa9bbdf3d1d97fa313103b8b1b1eb3b
Signed-off-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16500 utils: set default ost index for lfs migrate 19/49819/3
Jian Yu [Wed, 1 Feb 2023 07:11:56 +0000 (23:11 -0800)]
LU-16500 utils: set default ost index for lfs migrate

Running "lfs migrate <file>" without any SETSTRIPE arguments
to balance space usage keeps the PFL file layout, but preserves
the OST selection exactly, which makes the migration virtually
useless for space balancing.

This patch fixes the above issue by clearing the specific
OST indices from the source layout before using the layout to
create the volatile file in lfs_migrate().

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I82e1dc0a11fdda7d555df994cf4e5f6e3dbdcb5c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49819
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-930 ptlrpc: clarify AT error message 48/49548/5
Aurelien Degremont [Tue, 18 Jan 2022 13:55:01 +0000 (13:55 +0000)]
LU-930 ptlrpc: clarify AT error message

Clarify the error message related to a passed deadline
for AT early replies. It indicated that the system
was CPU bound, which is wrong most of the time, as the issue
is usually a communication failure delaying RPC traffic.
This could confuse people, who would look at
CPU resource consumption when network traffic is
more likely the cause.

Also try to avoid cryptic keywords that only make
sense to the feature developer, and not to admins.

Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: Icdff8f4c6fb9905233f6b8ed1b961b2fd1127667
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49548
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16367 utils: clean up ldiskfs feature handling 16/49316/2
Andreas Dilger [Mon, 5 Dec 2022 18:59:02 +0000 (11:59 -0700)]
LU-16367 utils: clean up ldiskfs feature handling

Update the default ldiskfs features used by mkfs.lustre:
- enable large_dir on OSTs as well as MDTs
- remove obsolete handling of "ext3" filesystems
- clean up handling of other features that have become a bit messy

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id717c3ba939ccf9b2de34e868d4415e88429ef39
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49316
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16221 kernel: new kernel [RHEL 9.1 5.14.0-162.12.1.el9_1] 38/48938/9
Jian Yu [Fri, 27 Jan 2023 20:34:11 +0000 (12:34 -0800)]
LU-16221 kernel: new kernel [RHEL 9.1 5.14.0-162.12.1.el9_1]

This patch makes changes to support new RHEL 9.1 release
for Lustre client.

Test-Parameters: trivial clientdistro=el9.1 \
env=SANITY_EXCEPT="130 244a" testlist=sanity

Change-Id: I8af730f84c9ddf9dcb7e3ddfbd24a68173f51e8d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48938
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
20 months agoLU-16510 build: fortified memcpy from linux 6.1 11/49811/7
Shaun Tancheff [Mon, 30 Jan 2023 04:17:12 +0000 (22:17 -0600)]
LU-16510 build: fortified memcpy from linux 6.1

The fortified memcpy() from Linux v5.11-11104-ga28a6e860c6c
through v5.18-rc5-1405-g43213daed6d6 reports a false-positive
out-of-bounds read:

In function 'memcpy' ...
  '__read_overflow2' declared with attribute error: detected
   read beyond size of object passed as 2nd parameter

Test-Parameters: trivial
HPE-bug-id: LUS-11459
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3a59d8b647833c05ff4b51e327ed8bce894141fe
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49811
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
20 months agoLU-16292 llite: delete_from_page_cache not exported 69/49069/14
Shaun Tancheff [Thu, 19 Jan 2023 07:38:02 +0000 (01:38 -0600)]
LU-16292 llite: delete_from_page_cache not exported

Linux commit v5.16-rc4-44-g452e9e6992fe
filemap: Add filemap_remove_folio and __filemap_remove_folio

Directly removing a folio/page from the page cache is not
available.

Fall back to generic_error_remove_page for regular files,
and to truncate_inode_pages_range as appropriate.

Test-Parameters: trivial
HPE-bug-id: LUS-11198
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I634e7d7719d497ce035a78b424be8e9e8c5a8104
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49069
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
20 months agoLU-16188 mdt: fix incompatible HSM request handling 58/48658/6
Aurelien Degremont [Mon, 26 Sep 2022 12:27:37 +0000 (12:27 +0000)]
LU-16188 mdt: fix incompatible HSM request handling

When the coordinator tries to send multiple HSM actions in
a single request, if one of the requests fails the incompat checks,
all the requests are marked as STARTED but none of them is
sent to the agent.

Return EAGAIN from mdt_agent_hsm_send() so that the coordinator would
not mark the requests as STARTED. It would retry them later.
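
The retry behaviour described above can be sketched in user-space C. This is only an illustration under stated assumptions: demo_action, demo_agent_send() and demo_coordinator_dispatch() are hypothetical names, not the functions touched by this patch, and the "compatible" flag stands in for the real incompat checks.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical request states; the real coordinator has more. */
enum demo_state { DEMO_WAITING, DEMO_STARTED };

struct demo_action {
	int compatible;		/* stand-in for the incompat checks */
	enum demo_state state;
};

/* Refuse the whole batch if any action fails the check. */
int demo_agent_send(const struct demo_action *acts, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!acts[i].compatible)
			return -EAGAIN;	/* nothing was sent to the agent */
	return 0;
}

/* Mark actions STARTED only when the send really succeeded. */
int demo_coordinator_dispatch(struct demo_action *acts, size_t n)
{
	size_t i;
	int rc = demo_agent_send(acts, n);

	if (rc != 0)
		return rc;	/* states stay WAITING, so a later pass retries */

	for (i = 0; i < n; i++)
		acts[i].state = DEMO_STARTED;
	return 0;
}
```

The point of the sketch is the ordering: the state change happens after the send result is known, so a failed batch is left untouched for the next pass.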

Add a sanity-hsm test.

Test-Parameters: trivial testlist=sanity-hsm
Change-Id: Id4fb858021be6dc6b0cbcf140c3f2051efce57ad
Signed-off-by: Jeya Ganesh Babu Jegatheesan <jeyaga@amazon.com>
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48658
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16118 build: Workaround __write_overflow_field errors 64/48364/23
Shaun Tancheff [Sun, 22 Jan 2023 17:43:29 +0000 (11:43 -0600)]
LU-16118 build: Workaround __write_overflow_field errors

Linux commit v5.17-rc3-1-gf68f2ff91512
   fortify: Detect struct member overflows in memcpy() at compile-time

memcpy and memset of collections of struct members
will trigger:

error: call to ‘__write_overflow_field’ declared with attribute
   warning: detected write beyond size of field (1st parameter);
   maybe use struct_group()?
   [-Werror] __write_overflow_field(p_size_field, size);
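
The compiler hint above mentions struct_group(). As a rough user-space illustration of that idea (not the workaround applied by this patch), the sketch below wraps two adjacent members in a named group so that memcpy() targets one object whose size covers both. DEMO_STRUCT_GROUP is a simplified, hypothetical stand-in for the kernel macro, and demo_hdr is an invented structure; C11 anonymous structs and unions are assumed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for the kernel's struct_group(): the members are
 * declared twice inside a union, once anonymously (so hdr.a still works)
 * and once under NAME (so the group can be addressed as one object).
 */
#define DEMO_STRUCT_GROUP(NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } NAME; \
	}

struct demo_hdr {
	int kind;
	DEMO_STRUCT_GROUP(body,
		int a;
		int b;
	);
	int tail;
};

/* Copy only the grouped members; the destination is a single object. */
void demo_copy_body(struct demo_hdr *dst, const struct demo_hdr *src)
{
	memcpy(&dst->body, &src->body, sizeof(dst->body));
}
```

Copying through &dst->body rather than &dst->a gives the fortify checks a destination whose size matches the copy length.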

Test-Parameters: trivial
HPE-bug-id: LUS-11194
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iacd1ab03d1b90ce62b5d7b65e1cd518a5f7981f2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48364
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-16354 ldiskfs: RHEL9.1 server support 83/49283/9
Shaun Tancheff [Sat, 21 Jan 2023 06:16:25 +0000 (00:16 -0600)]
LU-16354 ldiskfs: RHEL9.1 server support

ldiskfs patch series for RHEL9.1

Test-Parameters: trivial
HPE-bug-id: LUS-11332
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia0757995ac7200eb50fadf5e106fe1d7b3dc0443
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49283
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
20 months agoLU-16477 ldiskfs: Add ext4-enc-flag patch for RHEL9 35/49635/7
Shaun Tancheff [Fri, 20 Jan 2023 15:27:15 +0000 (09:27 -0600)]
LU-16477 ldiskfs: Add ext4-enc-flag patch for RHEL9

Update ext4-enc-flag for Linux 5.14 and include it in
the 5.14-based RHEL9 and SUSE 15 SP4 ldiskfs series.

Test-Parameters: trivial
HPE-bug-id: LUS-11442
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iaf4ba914fafe6a9e4ad58b74ae63343bb2918a44
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49635
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
20 months agoLU-15728 llite: fix relatime support 17/47017/16
Aurelien Degremont [Thu, 7 Apr 2022 12:58:00 +0000 (12:58 +0000)]
LU-15728 llite: fix relatime support

relatime behavior is properly managed by VFS, however
Lustre also stores acmtime on OST objects and atime
updates for OST objects should honor relatime behavior.

This patch updates 'ci_noatime' feature which was introduced to
properly honor noatime option for OST objects, to also support
'relatime'.
file_is_noatime() code already comes from upstream touch_atime().
Add missing parts from touch_atime() to also support relatime.
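
For readers unfamiliar with relatime, the decision it makes can be sketched as below. This is a user-space approximation of the upstream rule as commonly described (update atime when it is not newer than mtime or ctime, or when it is at least a day old); demo_relatime_need_update() is a hypothetical name and the code is not taken from this patch.

```c
#include <stdbool.h>

#define DEMO_DAY_SECONDS (24LL * 60 * 60)

/* All times are seconds since the epoch. */
bool demo_relatime_need_update(long long atime, long long mtime,
			       long long ctime_sec, long long now)
{
	if (mtime >= atime)
		return true;	/* file was modified since the last access */
	if (ctime_sec >= atime)
		return true;	/* inode was changed since the last access */
	if (now - atime >= DEMO_DAY_SECONDS)
		return true;	/* stored atime is at least a day old */
	return false;
}
```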

It also forces atime to disk on MDD if ondisk atime is older than
ondisk mtime/ctime to match relatime (even if relatime is not enabled)

Add a new test for relatime feature.

Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: I7a26f39841300a60c015944f9e544115b4446ead
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47017
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-6142 ldlm: minor list_entry improvements in ldlm_request.c 38/49738/3
Mr. NeilBrown [Mon, 23 Jan 2023 21:51:18 +0000 (16:51 -0500)]
LU-6142 ldlm: minor list_entry improvements in ldlm_request.c

Small clarity improvements, and one local variable avoided.

Linux-commit: cb830bef04f1bd80da7eca3d3edaea590f4b350b

Change-Id: I1a34849adca228a465a2b771fb0aa707a9283c7c
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49738
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 ldlm: use list_for_each_entry in ldlm_lock.c 34/49734/4
Mr. NeilBrown [Mon, 23 Jan 2023 21:24:59 +0000 (16:24 -0500)]
LU-6142 ldlm: use list_for_each_entry in ldlm_lock.c

This makes some slightly-confusing code a bit clearer, and
avoids the need for 'tmp'.
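
The kind of cleanup meant here can be illustrated with a self-contained user-space sketch. The demo_* list helpers below are simplified stand-ins for the kernel's list.h (the real list_for_each_entry() infers the type with typeof, while this version takes it as an argument), and demo_lock is an invented structure rather than anything from ldlm_lock.c.

```c
#include <stddef.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified list_for_each_entry() with an explicit type argument. */
#define demo_list_for_each_entry(pos, head, type, member) \
	for ((pos) = demo_container_of((head)->next, type, member); \
	     &(pos)->member != (head); \
	     (pos) = demo_container_of((pos)->member.next, type, member))

struct demo_lock {
	int mode;
	struct demo_list_head link;
};

void demo_list_init(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

void demo_list_add_tail(struct demo_list_head *n, struct demo_list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Before: walk raw list heads through 'tmp' and convert each one by hand. */
int demo_sum_old(struct demo_list_head *head)
{
	struct demo_list_head *tmp;
	int sum = 0;

	for (tmp = head->next; tmp != head; tmp = tmp->next) {
		struct demo_lock *lock =
			demo_container_of(tmp, struct demo_lock, link);

		sum += lock->mode;
	}
	return sum;
}

/* After: iterate over the entries directly; 'tmp' is no longer needed. */
int demo_sum_new(struct demo_list_head *head)
{
	struct demo_lock *lock;
	int sum = 0;

	demo_list_for_each_entry(lock, head, struct demo_lock, link)
		sum += lock->mode;
	return sum;
}
```

Both functions visit the same entries in the same order; only the bookkeeping differs.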

Linux-commit: 557d001aa51fd6171d7a68dec21f8327fc824173

Change-Id: If9d070492e0016fa235fb38726f7c7a3b380d580
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49734
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-12275 tests: skip new nodemap params on old MGS 28/49828/3
Andreas Dilger [Mon, 30 Jan 2023 21:46:37 +0000 (14:46 -0700)]
LU-12275 tests: skip new nodemap params on old MGS

Skip setting forbid_encryption and readonly_mount parameters on old
MGSes that do not support these options.  Otherwise test_61 failures
are seen during interop testing.  Running test_36 would also fail in
this case, except that it is already skipped due to encryption checks.

Test-Parameters: trivial testlist=sanity-sec
Fixes: 598c48707c ("LU-12275 tests: exercise file content encryption/decryption")
Fixes: e7ce67de92 ("LU-15451 sec: read-only nodemap flag")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I94f2e2f609927fea618a3a22f103bd32ae3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16412 llite: check read page past requested 23/49723/9
Qian Yingjin [Fri, 20 Jan 2023 17:30:27 +0000 (12:30 -0500)]
LU-16412 llite: check read page past requested

Due to a kernel bug introduced in 5.12 in commit:
cbd59c48ae2bcadc4a7599c29cf32fd3f9b78251
("mm/filemap: use head pages in generic_file_buffered_read")
if the page immediately after the current read is in cache,
the kernel will try to read it.

This attempts to read a page past the end of requested
read from userspace, and so has not been safely locked by
Lustre.

For a page after the end of the current read, check whether
it is under the protection of a DLM lock. If so, we take a
reference on the DLM lock until the page read has finished
and then release the reference.  If the page is not covered
by a DLM lock, then we are racing with the page being
removed from Lustre.  In that case, we return
AOP_TRUNCATED_PAGE, which makes the kernel release its
reference on the page and retry the page read.  This allows
the page to be removed from cache, so the kernel will not
find it and incorrectly attempt to read it again.
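
The decision described in this paragraph can be modelled in a few lines of user-space C. Everything here is a simplification under assumptions: demo_dlm_lock reduces a DLM lock to a page-index range plus a reference count, demo_read_page() is a hypothetical function, and DEMO_AOP_TRUNCATED_PAGE is a local stand-in for the kernel's AOP_TRUNCATED_PAGE return code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel's AOP_TRUNCATED_PAGE. */
#define DEMO_AOP_TRUNCATED_PAGE 0x80001

struct demo_dlm_lock {
	unsigned long first, last;	/* inclusive page index range */
	int refcount;
};

bool demo_lock_covers(const struct demo_dlm_lock *lock, unsigned long index)
{
	return lock != NULL && index >= lock->first && index <= lock->last;
}

/*
 * Read one page. Pages beyond last_requested were not locked by the
 * original read, so they need their own check and lock reference.
 */
int demo_read_page(unsigned long index, unsigned long last_requested,
		   struct demo_dlm_lock *lock)
{
	bool past_request = index > last_requested;

	if (past_request) {
		if (!demo_lock_covers(lock, index))
			return DEMO_AOP_TRUNCATED_PAGE;	/* caller drops page, retries */
		lock->refcount++;	/* pin the lock for the duration of the read */
	}

	/* ... the actual page read would happen here ... */

	if (past_request)
		lock->refcount--;	/* read finished, release the reference */
	return 0;
}
```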

NB: Earlier versions of this description refer to stripe
boundaries, but the locking issue can occur whether or
not the page is on a stripe boundary, because dlmlocks
can cover part of a stripe.  (This is rare, but is
allowed.)

Change-Id: Ib93bd0624fda0ed1c2b89f609d15208c86e21c29
Signed-off-by: Qian Yingjin <qian@ddn.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49723
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16457 tests: wait for remote sleep in sanity-pcc/101a 87/49587/9
Andreas Dilger [Tue, 10 Jan 2023 15:37:03 +0000 (08:37 -0700)]
LU-16457 tests: wait for remote sleep in sanity-pcc/101a

Wait longer for the remote sleep command to start on the agent node.

Test-Parameters: trivial testlist=sanity-pcc env=ONLY=101a,ONLY_REPEAT=200
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5dcbd6a7127b3e17aa658c87f5c75874432dc353
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16159 osp: destroy should not overtake writes 87/49787/5
Alex Zhuravlev [Thu, 26 Jan 2023 07:34:25 +0000 (10:34 +0300)]
LU-16159 osp: destroy should not overtake writes

Use transaction versioning for object destroy so that a
destroy cannot overtake writes and writes do not hit
nonexistent objects.

Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single env=ONLY="70b 71a 119",ONLY_REPEAT=10
Fixes: b054fcd785 ("LU-16159 lod: cancel update llogs upon recovery abort")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iec2a5c72f27825820d36ebbe20d55fa303358982
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49787
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
20 months agoLU-16431 mds: Close request is dropped during replay 06/49506/4
Andriy Skulysh [Mon, 21 Mar 2022 12:00:59 +0000 (14:00 +0200)]
LU-16431 mds: Close request is dropped during replay

MDS_CLOSE can have the same transno as a SETATTR update,
but it still needs to be processed to close the file.

Change-Id: I44c8e10c5e30f2dca4fab4d49a74d147495640c2
HPE-bug-id: LUS-10838
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
20 months agoLU-16392 utils: use --list-commands for bash completion 84/49484/9
Thomas Bertschinger [Wed, 21 Dec 2022 16:52:50 +0000 (11:52 -0500)]
LU-16392 utils: use --list-commands for bash completion

The CLI utils lctl and lfs currently use a pseudo option
--non-existent-option to generate a list of completions. However, this
was broken when the help output for an invalid command was changed.
Using --list-commands instead means that the format of the help output
can be kept succinct.

However, currently there are 2 issues that make --list-commands
unsuitable.

First, --list-commands truncates long commands. This commit resolves
this by not truncating long commands, and removing the fixed-length
char buffer and writing directly to stdout so that the line length
can overflow slightly if needed.

Second, --list-commands recursively displays sub-commands. For
example, for `lctl`, it will display `pcc add`, `pcc del`, etc in
addition to just `pcc`. The bash completion tools would view these
as separate tokens and thus would inappropriately suggest `add`,
`del`, etc. as completions for `lctl`. This commit removes the
recursive behavior.
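
A non-recursive listing of the kind described can be sketched as follows. The demo_cmd table layout and demo_list_commands() are hypothetical and are not the parser code from lustre/utils; printing one name per line is an assumption made to keep the example short.

```c
#include <stdio.h>

/* Hypothetical command table entry; 'sub' may point to sub-commands. */
struct demo_cmd {
	const char *name;
	const struct demo_cmd *sub;
};

/*
 * Print only the top-level command names, one per line. Names go
 * straight to the stream, so long names are never truncated, and the
 * 'sub' tables are deliberately not visited.
 */
int demo_list_commands(const struct demo_cmd *cmds, FILE *out)
{
	int count = 0;

	for (; cmds->name != NULL; cmds++) {
		fprintf(out, "%s\n", cmds->name);
		count++;
	}
	return count;
}
```

With a table containing `pcc` (with `add` and `del` below it), `attach` and `detach`, only the three top-level names are emitted, which is what a completion script needs for the first token.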

Removing the recursive behavior resolves an unrelated bug with the
recursion that can be observed for `lctl`, where a number of
top-level commands are skipped following recursion into a previous
sub-command, equal to the number of subcommands processed in the
recursive call. Specifically, the commands in the section "device
setup", e.g. `attach`, `detach`, were not displayed following the
recursive call into `pcc`.

Finally, this commit changes the command parser to recognize --help
and print the list of commands when this argument is seen.

Fixes: bc69a8d058 ("LU-8621 utils: cmd help to stdout or short cmd error")
Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Ib6e139402b9cd18e5a54b8fd3d6a2652d301e736
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49484
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
20 months agoLU-16382 spec: Declare correct license 63/49363/4
Mr NeilBrown [Mon, 12 Dec 2022 04:20:08 +0000 (15:20 +1100)]
LU-16382 spec: Declare correct license

Lustre is primarily licensed under GPL-v2.  Some files claim v2+,
others claim v2-only, but all are consistent with v2.

liblustreapi is LGPL2.1+

So make that explicit in lustre.spec.  All 'kmp' packages are
GPL-v2-only, all the rest add "AND LGPL-2.1-or-later".

The Open Build Service complains that "GPL" is too vague.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4f10c50a39b5b48fed71b179bc888b0ae144444e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49363
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16310 sec: Lustre/HSM on enc file with enc key 53/49153/7
Sebastien Buisson [Mon, 14 Nov 2022 16:28:36 +0000 (17:28 +0100)]
LU-16310 sec: Lustre/HSM on enc file with enc key

Support for Lustre/HSM on encrypted files when the encryption key is
available requires similar attention as with file migration.
The volatile file used for HSM restore must have the same encryption
context as the Lustre file being restored, so that file content
remains accessible after the layout swap at the end of the restore
procedure.

Please note that using Lustre/HSM with the encryption key creates
clear text copies of encrypted files on the HSM backend storage.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I99cba202cd2c7c747bbe5c4ec7d9208c7f6baf4b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49153
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
20 months agoLU-16205 sec: fid2path for encrypted files 30/48930/8
Sebastien Buisson [Thu, 3 Nov 2022 10:52:02 +0000 (11:52 +0100)]
LU-16205 sec: fid2path for encrypted files

Add support of fid2path for encrypted files. Server side returns raw
encrypted path name to client, which needs to process the returned
string. This is done from top to bottom, by iteratively decrypting
parent name and then doing a lookup on it, so that child can in turn
be decrypted.

For encrypted files that do not have their names encrypted, lookups
can be skipped. Indeed, name decryption is a no-op in this case, which
means it is not necessary to fetch the encryption key associated with
the parent inode.

Without the encryption key, lookups are skipped for the same reason.
But names have to be encoded and/or digested. So server needs to
insert FIDs of individual path components in the returned string.
These FIDs are interpreted by the client to build encoded/digested
names.

Add sanity-sec test_63 to exercise this new capability.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I165bf2e5657037ae2e25c9378e4713537ea94bec
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-13703 utils: fix lfs_migrate with PFL arguments 45/39145/21
Andreas Dilger [Wed, 17 Jun 2020 10:14:39 +0000 (04:14 -0600)]
LU-13703 utils: fix lfs_migrate with PFL arguments

Pass the '-c', '-S', and '--pool' options to "lfs migrate" when
they are part of a PFL component (after -E), rather than using
them to set the stripe_count and stripe_size of the whole file.

This precludes using '-A' and '-R' with explicitly specified PFL
file layouts, but that didn't make sense in the first place.

Fix the handling of "--pool <pool>" to use "-p <pool>" since
the script later only strips "-p " from the pool name.

Test-Parameters: trivial
Fixes: 60c5bc25025 ("LU-8235 scripts: pass unrecognized options to lfs migrate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib7fb6e08d81dbae77e8348fc5f09837c612540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/39145
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-13475 utils: disable lfs_migrate rsync and warning 14/40614/6
Andreas Dilger [Wed, 11 Nov 2020 21:53:22 +0000 (14:53 -0700)]
LU-13475 utils: disable lfs_migrate rsync and warning

The --rsync option is no longer enabled by default as a fallback if
'lfs migrate' fails for some reason; it must now be given explicitly
to use rsync. The warning message and "-y" option of lfs_migrate are
not needed if rsync is not used; the warning is only shown if --rsync
is used.

Remove the LFS_MIGRATE_RSYNC_MODE variable that was used for tests
and instead pass the "--rsync" option directly when needed.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I70b70d969f2dc8b4836c6c7692e6a73a0e2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40614
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 ldlm: use list_for_each_entry in ldlm_resource.c 39/49739/2
Mr. NeilBrown [Mon, 23 Jan 2023 21:55:55 +0000 (16:55 -0500)]
LU-6142 ldlm: use list_for_each_entry in ldlm_resource.c

Having a stand-alone "list_entry()" call is often a sign
that something like "list_for_each_entry()" would
make the code clearer.

Linux-commit: 5eb50608ed0fa076d2783898055fb20934a3828c

Change-Id: I5abd6cc7ec0abd31acc55f5af58f440c4f7609a7
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49739
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 ldlm: use list_first_entry in ldlm_lockd.c 37/49737/2
Mr. NeilBrown [Mon, 23 Jan 2023 21:45:05 +0000 (16:45 -0500)]
LU-6142 ldlm: use list_first_entry in ldlm_lockd.c

This is only a small simplification, but it makes the code
a little clearer.

Linux-commit: 7378caf4fe5198ce572654c926437fba12fb2255

Change-Id: Ie65049e12a1b1bbe448baefc38a6657d831e0670
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49737
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 lustre: obdclass: simplify cl_lock_fini() 36/49736/2
Mr. NeilBrown [Mon, 23 Jan 2023 21:33:34 +0000 (16:33 -0500)]
LU-6142 lustre: obdclass: simplify cl_lock_fini()

Using list_first_entry_or_null() makes this (slightly)
simpler.

Linux-commit: 988b9ea9129bc24baf36ee421feb823285f234c4

Change-Id: Ic2fe2bb58b67781c8bc7b4e81cbf6b61dcaa56fb
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49736
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 lov: simplify lov_finish_set() 35/49735/2
Mr. NeilBrown [Mon, 23 Jan 2023 21:29:57 +0000 (16:29 -0500)]
LU-6142 lov: simplify lov_finish_set()

When deleting everything from a list, a while loop
is cleaner than list_for_each_safe().
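
The pattern can be shown with a small user-space sketch that drains a list by repeatedly taking its first entry. The demo_* helpers are simplified stand-ins for the kernel's list.h (demo_first_req_or_null() plays the role of list_first_entry_or_null()), and demo_req is an invented structure, not the one used in lov_finish_set().

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_req {
	int id;
	struct demo_list_head link;
};

void demo_list_init(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

void demo_list_add_tail(struct demo_list_head *n, struct demo_list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

void demo_list_del(struct demo_list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* First entry of the list, or NULL when the list is empty. */
struct demo_req *demo_first_req_or_null(struct demo_list_head *head)
{
	if (head->next == head)
		return NULL;
	return demo_container_of(head->next, struct demo_req, link);
}

/* Free every entry; no 'next' cursor has to be saved across the delete. */
int demo_drain(struct demo_list_head *head)
{
	struct demo_req *req;
	int freed = 0;

	while ((req = demo_first_req_or_null(head)) != NULL) {
		demo_list_del(&req->link);
		free(req);
		freed++;
	}
	return freed;
}
```

Because each pass re-reads the head of the list, the loop stays correct even though the current entry is freed inside it.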

Linux-commit: dff162689a4061ff30d3a05f9d790e375c06ab8f

Change-Id: I90d98ebf14f461796d6f9d31a2c62de1520034cc
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49735
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-16349 o2iblnd: Fix key mismatch issue 14/49714/3
Dean Luick [Thu, 19 Jan 2023 20:38:04 +0000 (21:38 +0100)]
LU-16349 o2iblnd: Fix key mismatch issue

If a pool memory region (mr) is mapped then unmapped without being
used, its key becomes out of sync with the RDMA subsystem.

At pool mr map time, the present code will create a local
invalidate work request (wr) using the mr's present key and then
change the mr's key.  When the mr is first used after being mapped,
the local invalidate wr will invalidate the original mr key, and
then a fast register wr is used with the modified key.  The fast
register will update the RDMA subsystem's key for the mr.

The error occurs when the mr is never used.  The next time the mr
is mapped, a local invalidate wr will again be created, but this
time it will use the mr's modified key.  The RDMA subsystem never
saw the original local invalidate, so now the RDMA subsystem's
key for the mr and o2iblnd's key for the mr are out of sync.

Fix the issue by tracking if the invalidate has been used.
Repurpose the boolean frd->frd_valid.  Presently, frd_valid is
always false.  Remove the code that used frd_valid to conditionally
split the invalidate from the fast register.  Instead, use frd_valid
to indicate when a new invalidate needs to be generated.  After a
post, evaluate if the invalidate was successfully used in the post.
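
The failure and the fix can be followed in a toy model. This is a sketch under loud assumptions: demo_mr, demo_mr_map() and demo_mr_post() are hypothetical, the key change is modelled as a simple increment, and reusing a still-pending invalidate on the next map is my reading of the description above, not code from o2iblnd.

```c
#include <stdbool.h>

struct demo_mr {
	unsigned int local_key;	/* key as tracked by the driver */
	unsigned int hw_key;	/* key as known to the RDMA subsystem */
	unsigned int inv_key;	/* key named by the prepared invalidate */
	bool inv_pending;	/* invalidate prepared but never posted */
};

void demo_mr_init(struct demo_mr *mr, unsigned int key)
{
	mr->local_key = key;
	mr->hw_key = key;
	mr->inv_key = 0;
	mr->inv_pending = false;
}

/*
 * Map the region: prepare an invalidate for the current key, then
 * change the key. With track == false (old behaviour) this happens on
 * every map, even if the previous invalidate was never posted.
 */
void demo_mr_map(struct demo_mr *mr, bool track)
{
	if (track && mr->inv_pending)
		return;	/* earlier invalidate is still the right one */

	mr->inv_key = mr->local_key;
	mr->local_key++;
	mr->inv_pending = true;
}

/*
 * Post the work requests: the invalidate must name the key the RDMA
 * subsystem currently holds; the fast register then installs the new
 * key. Returns false on a key mismatch.
 */
bool demo_mr_post(struct demo_mr *mr)
{
	bool ok = (mr->inv_key == mr->hw_key);

	mr->hw_key = mr->local_key;
	mr->inv_pending = false;
	return ok;
}
```

In this model, mapping twice without a post in between makes the old path invalidate a key the RDMA side has never seen, while the tracked path keeps the two views in step.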

These changes are only meaningful to the FRWR code path.  The failure
has only been observed when using Omni-Path Architecture.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I532a11f10ae6a5917a4c054f37747d08eb4d6331
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49714
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-13482 utils: bandwidth limit for lfs migrate 20/49620/4
Timothy Day [Tue, 10 Jan 2023 04:55:47 +0000 (04:55 +0000)]
LU-13482 utils: bandwidth limit for lfs migrate

Add an option -W to control how much bandwidth
an lfs migrate job can consume. The migrate job
will periodically sleep to meet the bandwidth
restrictions.
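
One way such a throttle can decide how long to sleep is sketched below. It is not the lfs migrate implementation: demo_throttle_delay_sec() is a hypothetical helper, the limit is expressed in bytes per second for simplicity, and treating a limit of 0 as "unlimited" is an assumption.

```c
#include <stdint.h>

/*
 * Given the bytes copied so far and the time already spent, return how
 * many seconds to sleep so that the average rate does not exceed the
 * limit. A limit of 0 is treated as "no limit" in this sketch.
 */
double demo_throttle_delay_sec(uint64_t bytes_done, double elapsed_sec,
			       uint64_t limit_bytes_per_sec)
{
	double target_sec;

	if (limit_bytes_per_sec == 0)
		return 0.0;

	/* Time the copy should have taken at exactly the limit. */
	target_sec = (double)bytes_done / (double)limit_bytes_per_sec;

	return target_sec > elapsed_sec ? target_sec - elapsed_sec : 0.0;
}
```

For example, after copying 100 MiB in 4 seconds under a 10 MiB/s limit, the copy is 6 seconds ahead of schedule, so the helper asks for a 6 second pause; once the job falls behind the limit it returns 0.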

This patch also adds a --stats option. The option
produces regular log entries tracking the progress
of the migrate job. The logs are output in YAML
format. The frequency of the logs is controlled
by --stats-interval. This interval defaults to 5
seconds.

Also included are two tests, 56xh and 56xi. The
first verifies the functionality of the bandwidth
control. The second checks that the output is in
valid YAML and that the stats get printed without
using -W.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic71cceb2434a737e3ad8bd325f719e37a70b0047
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49620
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-6142 ldlm: Fix style issues for ldlm_extent.c 36/49536/6
Arshad Hussain [Fri, 23 Dec 2022 12:26:27 +0000 (17:56 +0530)]
LU-6142 ldlm: Fix style issues for ldlm_extent.c

This patch fixes issues reported by checkpatch
for file lustre/ldlm/ldlm_extent.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I9cecd1f377f33f3d4129cddcd7b59c3a7c003e04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
21 months agoLU-14980 lfsck: lock object in __lfsck_layout_update_pfid() 23/44823/28
Alex Zhuravlev [Thu, 2 Sep 2021 15:50:19 +0000 (18:50 +0300)]
LU-14980 lfsck: lock object in __lfsck_layout_update_pfid()

once the transaction has been started

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie43fe89009a123c88eb0e202ec961b52157e56c6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44823
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months agoLU-14692 tests: restore sanity/312 to always_except 20/49720/2
Andreas Dilger [Fri, 20 Jan 2023 06:22:18 +0000 (23:22 -0700)]
LU-14692 tests: restore sanity/312 to always_except

The sanity test_312 was incorrectly removed from ALWAYS_EXCEPT.

Fixes: eaae465556 ("LU-14692 tests: allow FID_SEQ_NORMAL for MDT0000")
Test-Parameters: trivial testlist=sanity fstype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6e8ed42561809b28fd6d5b4f7ee1104080ebe756
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49720
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months agoLU-16492 tests: sanity/398b variable used without assignment 87/49687/2
Arshad Hussain [Thu, 19 Jan 2023 02:41:43 +0000 (08:11 +0530)]
LU-16492 tests: sanity/398b variable used without assignment

This patch initializes the 'before' variable with a UNIX
timestamp. The variable was previously used without being
assigned any value.

Test-Parameters: trivial testlist=sanity env=ONLY=398b
Fixes: b4880f37582a ("LU-15483 tests: Improve test 398b")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iba9e361735272d9c640a115f520ee7c60ac41239
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49687
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-16425 tests: skip interop recovery-small/144a/144b 79/49679/6
Andreas Dilger [Wed, 18 Jan 2023 18:26:09 +0000 (11:26 -0700)]
LU-16425 tests: skip interop recovery-small/144a/144b

Skip recovery-small test_144a and test_144b when running against
an old MDS that is missing the fix corresponding to each test.

Fixes: 240938f7b1 ("LU-8367 tests: cleanup_orphans hang reproducer")
Fixes: aa6250b741 ("LU-15724 tests: MDT failover hang reproducer")
Test-Parameters: trivial testlist=recovery-small env=ONLY=144 serverversion=2.14.0
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I77bfdf55d0218aa9e252f742cc90f1c61216d506
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49679
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-15581 misc: update .gitignore files 32/49632/3
Timothy Day [Fri, 13 Jan 2023 19:33:15 +0000 (19:33 +0000)]
LU-15581 misc: update .gitignore files

Ignore the binary for check_iam utility
in lustre/utils.

Also, ignore more files for commit messages.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5b11dc2d2f3761f778549a121ac940edeeb70980
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49632
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
21 months agoLU-14707 tests: Prefer #!/bin/bash 79/49479/4
Timothy Day [Thu, 15 Dec 2022 06:19:01 +0000 (06:19 +0000)]
LU-14707 tests: Prefer #!/bin/bash

Change remaining #!/bin/sh to use bash.
Add a warning to the git-hook about using
sh in shebangs. Using bash allows scripts to
freely use bash-isms and lowers the risk
of bugs on Debian-based platforms.

Also, change remaining callers to use bash
rather than sh.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I10f3e8f71435c38cfc1650dd13168d7ed5d3b31f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49479
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
21 months agoLU-16412 llite: check truncated page in ->readpage() 33/49433/6
Qian Yingjin [Mon, 19 Dec 2022 06:57:39 +0000 (01:57 -0500)]
LU-16412 llite: check truncated page in ->readpage()

The page end offset calculation in filemap_get_read_batch() was
off by one. This bug was introduced in commit v5.11-10234-gcbd59c48ae
("mm/filemap: use head pages in generic_file_buffered_read")

When a read is submitted with end offset 1048575, it calculates
an end page index of 256 where it should be 255. As a result,
readpage() is called for the page with index 256, which is past the
stripe boundary and may not be covered by a DLM extent lock.
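
The arithmetic behind the 255 versus 256 discrepancy can be checked with a short sketch. It assumes 4 KiB pages and uses hypothetical helper names; it is not the kernel code, only an illustration of how an exclusive end index differs from the inclusive index of the last page actually touched by the read.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Index of the page that contains the last byte of the read (inclusive). */
uint64_t last_page_inclusive(uint64_t end_offset)
{
	return end_offset >> SKETCH_PAGE_SHIFT;
}

/* Index one past the last page of the read (exclusive bound). */
uint64_t end_page_exclusive(uint64_t end_offset)
{
	return (end_offset + SKETCH_PAGE_SIZE) >> SKETCH_PAGE_SHIFT;
}
```

For end offset 1048575 the inclusive index is 255 and the exclusive bound is 256. Treating the exclusive bound as if it were inclusive is one way an extra page, here page 256, ends up in the batch.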

This happens in a corner-case race: filemap_get_read_batch()
batches the page with index 256 for read, but the page is later
removed from the page cache because the lock protecting it is
revoked, while the batch still holds a reference on it.  The page
in the read path is then not covered by any DLM lock.

The solution is simple. In ->readpage(), check whether the page was
truncated and removed from the page cache by examining the
address_space pointer of the page. If it was truncated, return
AOP_TRUNCATED_PAGE to the caller.  This causes the
kernel to retry batching pages, and the truncated page will not
be added because it was already removed from the file's page cache.
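
The check described above can be sketched in user-space C. The struct and function names below are stand-ins invented for this example, and SKETCH_AOP_TRUNCATED_PAGE merely mirrors the kernel's AOP_TRUNCATED_PAGE return code; the actual llite change is not reproduced here.

```c
#include <stddef.h>

/* Stand-ins for the kernel types; only what the sketch needs. */
struct sketch_mapping {
	int unused;
};

struct sketch_page {
	struct sketch_mapping *mapping;	/* NULL once removed from the cache */
};

/* Stand-in for the kernel's AOP_TRUNCATED_PAGE return code. */
#define SKETCH_AOP_TRUNCATED_PAGE 0x80001

/*
 * Hypothetical readpage: if the page no longer belongs to the file's
 * mapping, it was truncated after being batched, so ask the caller to
 * retry instead of reading it without lock coverage.
 */
int sketch_readpage(struct sketch_page *page,
		    struct sketch_mapping *file_mapping)
{
	if (page->mapping != file_mapping)
		return SKETCH_AOP_TRUNCATED_PAGE;

	/* ... the real read would be issued here ... */
	return 0;
}
```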

Add sanityn/test_95 to verify it.

Test-Parameters: testlist=sanityn env=ONLY=95 clientdistro=ubuntu2204
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I192df92b1d1b79057055430cc81cb7cc760cc9ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-15880 quota: fix insane grant quota 81/48981/11
Hongchao Zhang [Mon, 16 Jan 2023 02:21:09 +0000 (21:21 -0500)]
LU-15880 quota: fix insane grant quota

Fix the insane grant value in the quota master/slave index.
The logs often contain content similar to the following:

LustreError: 39815:0:(qmt_handler.c:527:qmt_dqacq0())
$$$ Release too much! uuid:work-MDT0000-lwp-MDT0002_UUID
release:18446744070274413724 granted:18446744070291193856,
total:4118877744 qmt:work-QMT0000 pool:0-dt id:40212 enforced:1
hard:128849018880 soft:12884901888 granted:4118877744 time:0
qunit: 16777216 edquot:0 may_rel:0 revoke:0 default:no

It could be caused by chgrp, which reserves quota before changing
the GID of a file on the MDT, then releases the reserved quota after
the file GID has been changed on the corresponding OST (this issue
is tracked in LU-5152 and LU-11303).

In some cases, quota could be released even though it was not
reserved correctly, which causes the granted quota to go negative.
Because the type of grant is "__u64", the negative value is
interpreted as an insanely large value. Normal grant release then
fails, and the grant field of the quota ID in the quota file (both at
QMT and QSD) contains an insane value that can't be reset correctly.

This patch resets the affected quota by clearing the quota limits and
grant. The grant will be reported by each QSD when the quota ID
is enforced again, and the grant is then rebuilt at the QMT.
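
The wraparound that makes an over-released grant look insanely large can be shown with plain unsigned arithmetic. The helpers below are hypothetical and use uint64_t in place of __u64; they illustrate the failure mode and one possible guard, not the quota code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unchecked release: wraps around modulo 2^64 when release > granted. */
uint64_t release_unchecked(uint64_t granted, uint64_t release)
{
	return granted - release;
}

/*
 * Checked release (hypothetical guard): refuse to release more than is
 * currently granted, leaving the grant untouched on failure.
 */
bool release_checked(uint64_t *granted, uint64_t release)
{
	if (release > *granted)
		return false;

	*granted -= release;
	return true;
}
```

Releasing 4096 from a grant of 100 without the check yields 2^64 - 3996, a value just below 2^64, much like the "release" and "granted" figures in the log excerpt above.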

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I083afa3b6648db5a1ccca0235667da022ff27e65
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48981
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-16374 enc: align Base64 encoding with RFC 4648 base64url 81/49581/3
Sebastien Buisson [Sun, 18 Jul 2021 00:01:25 +0000 (19:01 -0500)]
LU-16374 enc: align Base64 encoding with RFC 4648 base64url

Lustre encryption uses a Base64 encoding to encode no-key filenames
(the filenames that are presented to userspace when a directory is
listed without its encryption key).
Make this Base64 encoding compliant with RFC 4648 base64url, and use
a leading '+' character to distinguish digested names.

This is adapted from kernel commit
ba47b515f594 fscrypt: align Base64 encoding with RFC 4648 base64url

To maintain compatibility with older clients, a new llite parameter
named 'filename_enc_use_old_base64' is introduced, set to 0 by
default. When 0, Lustre uses the new base64 encoding. When set to
1, Lustre uses the old-style base64 encoding.

To set this parameter globally for all clients, do on the MGS:
mgs# lctl set_param -P llite.*.filename_enc_use_old_base64={0,1}
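
For reference, the RFC 4648 base64url alphabet replaces '+' and '/' with '-' and '_', which keeps encoded names free of the path separator. The encoder below is a minimal sketch of that alphabet, not the Lustre or kernel implementation: the function name is hypothetical, and omitting '=' padding is an assumption of the example.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 4648 base64url alphabet: '-' and '_' replace '+' and '/'. */
static const char b64url_table[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/*
 * Hypothetical encoder: write the base64url form of src into dst without
 * '=' padding (an assumption of this sketch) and NUL-terminate it.
 * dst must hold at least (srclen * 4 + 2) / 3 + 1 bytes.
 * Returns the number of characters written, excluding the NUL.
 */
size_t b64url_encode(const uint8_t *src, size_t srclen, char *dst)
{
	uint32_t acc = 0;
	int bits = 0;
	char *out = dst;
	size_t i;

	for (i = 0; i < srclen; i++) {
		acc = (acc << 8) | src[i];
		bits += 8;
		/* Emit one character per complete 6-bit group. */
		while (bits >= 6) {
			bits -= 6;
			*out++ = b64url_table[(acc >> bits) & 0x3f];
		}
	}

	/* Left-align any leftover bits into a final character. */
	if (bits > 0)
		*out++ = b64url_table[(acc << (6 - bits)) & 0x3f];

	*out = '\0';
	return (size_t)(out - dst);
}
```

With this alphabet the two bytes 0xfb 0xff encode to "-_8", where standard base64 would produce "+/8" plus padding.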

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaa2256da7fb591d842b5bb7aa474b2ee6de9899d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49581
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-16476 ldiskfs: Fix old ea inode handling 34/49634/5
Shaun Tancheff [Sat, 14 Jan 2023 07:13:01 +0000 (01:13 -0600)]
LU-16476 ldiskfs: Fix old ea inode handling

ext4-old_ea_inodes_handling_fix.patch is applicable
to all Linux versions 4.18 and higher.

Apply it to all the current 5* series.

Test-Parameters: trivial
HPE-bug-id: LUS-11441
Fixes: 8da23f070c ("LU-15544 ldiskfs: SUSE 15 SP4 kernel 5.14.21 SUSE")
Fixes: 1819f6006f ("LU-15801 ldiskfs: Server support for RHEL9")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I1d22ed9d505e1bb407d9388cac9c881b366b96a8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49634
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 months agoLU-16096 recovery: upgrade reply data after recovery finish 61/48261/15
Qian Yingjin [Fri, 19 Aug 2022 02:32:36 +0000 (22:32 -0400)]
LU-16096 recovery: upgrade reply data after recovery finish

Because the batched RPC protocol changes the on-disk format of the
client reply data "REPLY_DATA" used for recovery, compatibility
must be handled carefully during upgrade to the new reply data
format.

The solution is as follows:
When client recovery has finished, the target truncates the
reply data file to zero size and rewrites the header to use the
new magic and reply data record size.
New reply data records are then written in the new format.

Enable the test cases conf-sanity/32 and 108 now that the
compatibility issue is fixed.

This patch also fixes the usage of struct lsd_reply_data in
lustre/utils/lr_reader.c to support both struct versions.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I26921d41915b8cad2d913e15f502f4543180c5c6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48261
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>