fs/lustre-release.git
Sebastien Buisson [Mon, 8 Mar 2021 14:20:00 +0000 (15:20 +0100)]
LU-14479 ssk: explicitly set perm on key

When an SSK key is loaded, either via the lgss_sk command or via the
skpath mount option, try to set permissions on the key.
This avoids a 'Permission denied' error when a Lustre client or
server wants to make use of the key later on.

Lustre-change: https://review.whamcloud.com/41929
Lustre-commit: f265033840996dcdffb2f05a64b51b51391a273c

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1ed712ae4d07be306cc76b4e59fab303437558bb
Reviewed-on: https://review.whamcloud.com/43164
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Sebastien Buisson [Thu, 18 Mar 2021 16:17:31 +0000 (17:17 +0100)]
LU-14534 gss: do not refresh context for LDLM callback

If the request to be sent is an LDLM callback, do not try to
refresh context.
An LDLM callback is sent by a server to a client in order to make
it release a lock, on a communication channel that uses a reverse
context. It cannot be refreshed on its own, as it is the 'reverse'
(server-side) representation of a client context.
We do not care if the reverse context is expired, and want to send
the LDLM callback anyway. Once the client receives the AST, it is
its job to refresh its own context if it has expired, hence
refreshing the associated reverse context on server side, before
being able to send the LDLM_CANCEL requested by the server.

Lustre-change: https://review.whamcloud.com/42076
Lustre-commit: 1769f262b96745b61b21fd1450cc4c0386a41b95

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic8f4fe203f16ed5cfafd3da355c78cf58d96c3eb
Reviewed-on: https://review.whamcloud.com/43173
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
John L. Hammond [Fri, 5 Mar 2021 14:55:39 +0000 (08:55 -0600)]
EX-2778 lipe: lipe.spec fixes

In lipe.spec.in, call install without specifying file ownership. Fix up
some bogus changelog dates.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I14946251ef9b39a8bab9f9c53a46d3c544ded240
Reviewed-on: https://review.whamcloud.com/43158
Tested-by: jenkins <devops@whamcloud.com>
John L. Hammond [Mon, 29 Mar 2021 14:11:53 +0000 (09:11 -0500)]
EX-2930 lipe: fix includes

In lipe/src/lipe_expression_test.c, include the headers we need and
replace <debug.h> with "debug.h".

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I1c416da5bb61b3219025d93706dfb6e798fccc1c
Reviewed-on: https://review.whamcloud.com/43156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
John L. Hammond [Fri, 26 Mar 2021 14:20:10 +0000 (09:20 -0500)]
EX-2930 lipe: fix errno.h include

In lipe/src/lustre_ea.c, include "errno.h" rather than <errno.h>.

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I91c7106502cb5f1da04cdf27071584233473f469
Reviewed-on: https://review.whamcloud.com/43133
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43143

Andreas Dilger [Sat, 27 Mar 2021 06:58:02 +0000 (00:58 -0600)]
RM-620 build: New tag 2.14.0-ddn2

New tag 2.14.0-ddn2

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I86c51faa9eb69723465332cc9e132a4e2915bc14

Serguei Smirnov [Mon, 8 Mar 2021 17:46:03 +0000 (09:46 -0800)]
LU-14499 revert: LU-13368 lnet: discard the callback

The changes introduced by LU-13368 have been shown to cause
the o2iblnd shutdown procedure to hang on lustre_rmmod
as it infinitely waits for peers to disconnect. Revert it.
This reverts commit babf0232273467b7199ec9a7c36047b1968913df.

Lustre-change: https://review.whamcloud.com/41937
Lustre-commit: TBD (from 9a1b64724bdb9452a6c3e14a92c7ef341173d19b)

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I489ae4af445b18df852ec35adc958c4fac33de09
Reviewed-on: https://review.whamcloud.com/42117
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Minh Diep [Fri, 26 Mar 2021 21:14:58 +0000 (14:14 -0700)]
EX-2932 llapi: fix '%llu' type mismatch on ppc64le

The ppc64le architecture unfortunately defines "__u64" as "unsigned long",
which does not match the '%llu' format.

Change-Id: I0941f0345df101031cdd44c3ac77220ff6b4cc5b
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43144
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Andreas Dilger [Sun, 7 Mar 2021 06:52:23 +0000 (23:52 -0700)]
RM-620 build: New tag 2.14.0-ddn1

New tag 2.14.0-ddn1

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic113f1c81a31b132da5ed2dcf0378d47553ebbe5

Qian Yingjin [Mon, 30 Nov 2020 02:08:17 +0000 (10:08 +0800)]
LU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag

Add a new connection flag OBD_CONNECT2_PCCRO to resolve access
consistency problems with old clients that lack PCC-RO support.

Lustre-change: https://review.whamcloud.com/40791
Lustre-commit: TBD (from d9ac6b2e7eaaad892a2ecd0460b0f6915216c1cd)

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I19716e94a86e53353c1628d414c92e61e084dfc9
Reviewed-on: https://review.whamcloud.com/43105
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Mon, 22 Mar 2021 09:16:15 +0000 (17:16 +0800)]
EX-2873 pcc: async attach in the background for PCC-RO file

Currently, PCC copies the whole file into the cache before it can
be used. For a large file this causes a significant delay on first
access, which wastes valuable computing time. Shortening this time
to first access may help application efficiency.

This patch adds a tuning parameter "async_threshold", the size
threshold that determines whether the PCC-RO attach is done
asynchronously in the background.

When the file size is smaller than the threshold, the PCC attach
during open() is performed synchronously.
Otherwise, the client starts a dedicated kernel thread to copy
data from Lustre OSTs to the PCC copy in the background, and
reads fall back to the normal Lustre I/O path from Lustre OSTs
until the file is fully cached.

This may double the reads to the Lustre filesystem initially if
the file is not read sequentially, but avoids the high latency
for data access. There may be some cache sharing (avoiding
double reads) if the PCC copy and the application both share
the filesystem's cached pages on the client.

The tuning parameter "llite.*.pcc_async_threshold" is set to
256MiB by default.
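
The threshold decision described above can be sketched roughly as
follows. This is an illustrative Python model only, not the kernel
code: the function and constant names are hypothetical, and only the
256MiB default and the sync/async split come from the message above.

```python
# Hypothetical sketch; names here are illustrative, not Lustre identifiers.
DEFAULT_ASYNC_THRESHOLD = 256 * 1024 * 1024  # 256 MiB, the stated default


def choose_attach_mode(file_size, async_threshold=DEFAULT_ASYNC_THRESHOLD):
    """Return "sync" for files below the threshold, "async" otherwise."""
    if file_size < async_threshold:
        return "sync"
    return "async"


def read_source(fully_cached):
    """Reads use the PCC copy only once the background copy has finished."""
    return "pcc" if fully_cached else "ost"
```

A file exactly at the threshold takes the asynchronous path in this
sketch, since only sizes smaller than the threshold attach synchronously.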

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia80992e9050cc6e4c7f61949fc4013dec303e150
Reviewed-on: https://review.whamcloud.com/42125
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Sun, 6 Sep 2020 08:52:04 +0000 (16:52 +0800)]
LU-12358 pcc: add project quota support on PCC backend

Current PCC can enforce a quota limitation of the capacity usage
for each user and group to provide cache isolation. An admin
can specify the quota enforcement on the local PCC file system.

Users can perform PCC-cached I/O on files until they receive a
return value of -ENOSPC or -EDQUOT, which means that they hit the
quota limit or that there is no free capacity left on the local
PCC backend fs during I/O or the attach process. At this point,
I/O will fall back to the normal I/O path.
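
As a rough illustration of this fallback, the Python sketch below
tries a cache write first and switches to the normal path only on the
two errors named above. The callables pcc_write and lustre_write are
hypothetical stand-ins for the two I/O paths, not Lustre functions.

```python
import errno


def write_with_fallback(pcc_write, lustre_write, data):
    """Try the PCC backend first; on ENOSPC/EDQUOT use the normal path.

    pcc_write and lustre_write are hypothetical callables standing in
    for the two I/O paths. Returns a (path_used, result) tuple.
    """
    try:
        return "pcc", pcc_write(data)
    except OSError as exc:
        # Only "no space" and "quota exceeded" trigger the fallback.
        if exc.errno in (errno.ENOSPC, errno.EDQUOT):
            return "lustre", lustre_write(data)
        raise
```

Any other error is re-raised unchanged, so the sketch does not hide
failures that are unrelated to cache capacity or quota.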

This patch adds project quota on the PCC backend file system
along with user/group quota.

With this feature, a single client can have multiple PCC backends
with different caching rules, so we can define upfront
how much of the client FS can be used for each cache.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib93da953d4a3a7091f62094f8175bde91e819895
Reviewed-on: https://review.whamcloud.com/41928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Qian Yingjin [Thu, 25 Feb 2021 12:43:58 +0000 (20:43 +0800)]
EX-2455 pcc: get PCC state for a file without opening itself

Originally, to get the PCC state for a given file, the user needed
to open the file, get the current PCC state via the file handle,
and then close the file.

If the file meets the predefined condition for auto prefetching
into PCC at open time, the "lfs pcc state" command on the file
will attach the file into the PCC cache. This may not be the
intention of the user.

This patch reworks the "lfs pcc state" command. It always opens
the parent directory and then does the lookup by name/FID, without
opening the file itself, to get the PCC state.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I310a7e73dc6c0f4318dc27df2e02ecf6559ee5b4
Reviewed-on: https://review.whamcloud.com/41927
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Qian Yingjin [Fri, 5 Feb 2021 03:48:26 +0000 (11:48 +0800)]
EX-2455 pcc: check first before set PCC-RO on a file

In this patch, the MDT first takes a CR layout lock on the file
object to check whether the file is already PCC-RO cached. If so,
it returns immediately; otherwise, it takes an EX lock on the file
to update the FLR PCC-RO state accordingly. This check avoids heavy
lock contention and unnecessary revocation of the layout lock
granted to other clients when multiple processes from many clients
perform read-only attach on a shared file simultaneously.
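
The check-before-update pattern can be modelled in a few lines. The
Python sketch below is a toy model under stated assumptions: the class
and function names are hypothetical, and real lock modes are only
represented by a counter of exclusive requests, not by actual locking.

```python
class LayoutState:
    """Toy stand-in for a file's layout; counts exclusive lock requests."""

    def __init__(self):
        self.pcc_ro = False
        self.ex_locks_taken = 0


def set_pcc_ro(layout):
    """Check under a shared (CR) lock first; take EX only when needed."""
    # Shared check: many clients can do this without conflicting.
    if layout.pcc_ro:
        return "already-cached"
    # Exclusive update: the step that would revoke other clients' locks.
    layout.ex_locks_taken += 1
    layout.pcc_ro = True
    return "updated"
```

With many callers attaching the same file, only the first call in this
model reaches the exclusive step; later calls return after the cheap
shared check.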

It also adds the layout intent write (LAYOUT_INTENT_PCCRO_SET
and LAYOUT_INTENT_PCCRO_CLEAR) with the FMODE_WRITE flag, so that
the conflicting lock can be revoked via the ELC strategy, avoiding
unnecessary lock traffic.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Change-Id: Id01ea69335ad8ad46bade356327644e0dfb571cc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/41926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
John L. Hammond [Thu, 25 Mar 2021 16:45:34 +0000 (11:45 -0500)]
EX-2921 lipe: add tools/lipe as lipe subtree

Merge commit 'e2a8a03f3599c42f85955d9c0339e9cc6a570214' as 'lipe'

git remote add tools/lipe ssh://review.whamcloud.com:29418/tools/lipe
git fetch --no-tags tools/lipe
git subtree add --prefix=lipe --squash tools/lipe/master

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I6e4e6e5349ea42beee3cc202a9bdaf7e29fb5b12

John L. Hammond [Thu, 25 Mar 2021 16:45:34 +0000 (11:45 -0500)]
Squashed 'lipe/' content from commit 38f79e56ec

git-subtree-dir: lipe
git-subtree-split: 38f79e56ec2816cefda2e6d8d3e1f56f1992549d

Qian Yingjin [Thu, 22 Oct 2020 08:22:45 +0000 (16:22 +0800)]
LU-12373 pcc: delete stale PCC copy when remove PCC backend

By default, when removing a PCC backend from a client, the action
is to scan the PCC backend FS and uncache (detach and remove) all
scanned PCC copies from PCC by FID.

However, during testing, we found that some old stale PCC copies
are not removed when an administrator runs "lctl pcc del|clear".
The reason is that these PCC copies are already detached from PCC
when the commands are run.

This patch fixes this bug: when removing a PCC backend from a
client, it will also delete all non-cached PCC copies from PCC
backend to free up the space.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Id829abe7e6cb1294e6baea76452f4a9178711451
Reviewed-on: https://review.whamcloud.com/41925
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Thu, 22 Oct 2020 01:29:12 +0000 (09:29 +0800)]
LU-14003 pcc: convert mapping pagecache for mmap

In the PCC mmap implementation, mmap() replaces the mapping of
the PCC copy with that of the Lustre file, so that the mmapped
region (vma) links into the mapping of the Lustre file rather
than the mapping of the PCC copy.
In the old design, the pagecache in the original mapping of the
PCC copy is simply dropped at this point, because the mapping of
each page is different after the replacement.

This may have a negative impact on mmap performance.
During PCC attach, the data from Lustre is written into the PCC
copy in buffered I/O mode. This data stays in the pagecache,
managed by the mapping of the PCC copy, if there is enough system
memory. A later mmap page fault could then read the data directly
from the pagecache to speed up the mmap operation.
If this pagecache is dropped because the pages belong to a
different mapping, the page fault must read the page from disk,
which may result in bad performance.

To make full use of the pagecache of the PCC copy, the mmap call
can first remove each page from the original mapping of the PCC
copy, and then convert and add it into the mapping of the Lustre
file. In this way, all cached pages are converted and can be
reused by later page faults.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1591937543d7d31b8811ec62088accd0070d7d37
Reviewed-on: https://review.whamcloud.com/41924
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 30 Sep 2020 03:00:43 +0000 (11:00 +0800)]
LU-14003 pcc: rework PCC mmap implementation

The old PCC mmap implementation replaces the vm_file with the file
of the PCC copy, calls ->fault() or ->page_mkwrite() on the PCC
copy, and then restores the vm_file to that of the Lustre file.
This design is problematic because a mmapped region (vma) can be
faulted concurrently by multiple child threads (each child thread
can clone the VM of the parent process). There is no atomicity
guarantee for replacing and restoring the vm_file while calling
->fault() or ->page_mkwrite().

This patch reworks the mmap() implementation for PCC.
In the new design, PCC mmap replaces the inode mapping of the PCC
copy on the PCC backend filesystem with that of the Lustre file.
In this way, the mmapped region (vma) links into the mapping of
the Lustre inode rather than the mapping of the PCC copy.
The vm_file keeps using the file handle of the PCC copy until
the PCC cached file is detached or unmapped.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Icc5019a691dfb04b5e1fdd580d83915cfe590158
Reviewed-on: https://review.whamcloud.com/41923
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Thu, 6 Aug 2020 08:29:21 +0000 (16:29 +0800)]
LU-13881 pcc: comparator support for PCC rules

There are increasing requirements for PCC rules to add comparator
support:
- Files larger or smaller than a certain threshold should not be
  auto cached in PCC (e.g. files larger than the capacity of the
  PCC backend on a client).
- Users can specify a range of UID/GID/ProjID for auto caching on
  PCC when defining a rule.
In addition to the original equal (=) operator, this patch also
adds greater than (>) and less than (<) comparison operators.

The following rule expressions are supported:
- "projid={100}&size>{1M}&size<{500G}"
- "projid>{100}&projid<{110}"
- "uid<{1500}&uid>{1000}"
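
To show how such expressions evaluate, here is a minimal Python
sketch that parses and checks the conjunction forms listed above. It
is not the Lustre parser: the function names, the accepted field
names, and the power-of-1024 size suffixes are assumptions made for
illustration, and only single-value, '&'-joined conditions are handled.

```python
import re

# Assumed suffix multipliers (powers of 1024); not confirmed by the text.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
_COND = re.compile(r"^(projid|uid|gid|size)([=<>])\{(\d+)([KMGT]?)\}$")


def parse_rule(expr):
    """Split an '&'-joined rule into (field, op, value) tuples."""
    conds = []
    for part in expr.split("&"):
        m = _COND.match(part)
        if m is None:
            raise ValueError("unsupported condition: " + part)
        field, op, num, unit = m.groups()
        conds.append((field, op, int(num) * _UNITS[unit]))
    return conds


def matches(conds, attrs):
    """Return True when every condition holds for the attrs dict."""
    for field, op, value in conds:
        actual = attrs[field]
        if op == "=":
            ok = actual == value
        elif op == ">":
            ok = actual > value
        else:  # op == "<"
            ok = actual < value
        if not ok:
            return False
    return True
```

Both comparison operators are strict in this sketch, so a rule such as
"uid<{1500}&uid>{1000}" selects the open range between the two bounds.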

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I9f024eb6903f5652ba3cf04fa289456803493b2c
Reviewed-on: https://review.whamcloud.com/41920
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Fri, 14 Jun 2019 09:29:55 +0000 (05:29 -0400)]
LU-12373 pcc: uncache the pcc copies when remove a PCC backend

Currently, removing a PCC backend from a client does not do any
special handling for previously cached files.
Users can still use the PCC caching service for these files. This
may not be what users want, for the following reasons:

1) For RW-PCC cached files, the data is not restored back
into the Lustre OSTs of the main filesystem. The PCC backend
does fall back to a traditional HSM storage solution, since
the lhsmtool_posix copytool is still running on this client,
but this is dangerous and likely to cause user data to be
lost if the PCC device becomes permanently unavailable.

2) The space used by these PCC cached files may not be released.

In this patch, when removing a PCC backend from a client, the
default action is to scan the PCC backend fs and uncache
(detach and remove) each PCC copy from PCC by FID.

We also add an option "--keep|-k" for PCC backend removal.
It behaves as before: it just removes the PCC backend, but
retains the data on the cache.

This patch also introduces a common library to scan the HSM
backend.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib4db36137c025fd78c7022c8b8c39b63e3b9ad4d
Reviewed-on: https://review.whamcloud.com/41919
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 22 Aug 2018 13:19:48 +0000 (21:19 +0800)]
LU-10918 pcc: auto RO-PCC caching when O_RDONLY open files

During the file open() operation, if the file is being opened with
O_RDONLY flags, and the file matches the predefined rule, it will
be prefetched and attached into RO-PCC automatically.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib2c2ab51d67aed84eb7676c8df191faa33dfad39
Reviewed-on: https://review.whamcloud.com/41918
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Mon, 23 Jul 2018 14:19:25 +0000 (22:19 +0800)]
LU-10499 pcc: add readonly mode for PCC

Readonly Persistent Client Cache (RO-PCC) shares the same framework
with Readwrite Persistent Client Cache, except that no HSM mechanism
is used in the readonly mode of PCC. Instead, RO-PCC adds a new flag
field in the file object's layout named LOV_PATTERN_F_RDONLY to
indicate that the file is in PCC read-only state. It is protected
by the layout lock.

After introducing the readonly feature for the layout, the I/O path
has some changes. For reads, if the file has a valid RO-PCC cached
copy, the file data can be read from PCC directly; otherwise, the
data is read from OSTs using the normal I/O path. Data-modifying
operations (write or truncate) must first clear the readonly flag of
the layout on the MDT (which invalidates the RO-PCC cached state on
clients via the layout lock blocking callback) before performing
I/O.

For RO-PCC, as the PCC cached file is actually a replica of the
Lustre file, a failed read on PCC can be tolerated by falling back
to the normal read path: reading data from OSTs.

This patch also combines PCC-RO with FLR. Similar to plain
layouts, a PCC-RO layout is a kind of HSM non-composite layout and
can be treated as a basic mirror component in FLR layouts.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6badd72e00a106a0f68950621ce6f82471731a95
Reviewed-on: https://review.whamcloud.com/41917
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Thu, 18 Mar 2021 04:00:28 +0000 (21:00 -0700)]
LU-14503 o2iblnd: clean up zombie connections on shutdown

Clean up zombie connections on net shutdown in o2iblnd.
Wake up connd threads and wait for them to do the clean-up
before proceeding.

Lustre-change: https://review.whamcloud.com/42068
Lustre-commit: 016029d97a8af446452b9934f4a01d4ea800ea7e

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ib094e2f480077034e78fe90e2aec9b1349f7e708
Reviewed-on: https://review.whamcloud.com/42069
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Nunez [Thu, 17 Dec 2020 18:21:01 +0000 (11:21 -0700)]
Revert "LU-13344 all: Separate debugfs and procfs handling"

This reverts commit 76626d6c52b19b5cca04007c4b1656cc52a487c1.

A performance regression was found, see LU-14055, and
tracked to this patch.

Change-Id: Ic094d840cd22ff0606f859ffb38280041c7efc0e
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42049
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Fri, 19 Mar 2021 00:44:55 +0000 (17:44 -0700)]
RM-633 ldiskfs: fix conflict in rhel7.7/ext4-loadbitmaps.patch

This patch fixes the conflict in fs/ext4/super.c.

Test-Parameters: trivial
Fixes: c5b92c3ec0 ("RM-633 ldiskfs: add loadbitmaps to load block bitmaps for rhel7.7")
Change-Id: Ia7a79fb7abc7c99866d7cd885bca837bde3ebbe5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42096
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 11 Mar 2021 06:21:56 +0000 (22:21 -0800)]
EX-2745 tests: reduce file size for hot-pools.sh test 56

This patch reduces the file size for hot-pools.sh test 56
by adjusting lpurge freelo and freehi options so as to
trigger lpurge after writing files with smaller size.

Test-Parameters: trivial testlist=hot-pools,hot-pools \
serverextra_install_params="--lipe-job lipe --lipe-build 0 -k zfs"

Test-Parameters: trivial testgroup=review-dne-part-2 \
serverextra_install_params="--lipe-job lipe --lipe-build 0 -k zfs"

Change-Id: I15c3a11e6453ec24d11c188ae82a980587259e80
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42027
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Thu, 25 Feb 2021 00:31:56 +0000 (08:31 +0800)]
LU-14460 lnet: fix mismatched printf format

Original "%llx" does not work on all platforms. Fix it.

Lustre-change: https://review.whamcloud.com/41755
Lustre-commit: 58e05ff5af3d1fcd7b059dc56955a5f8d94db4ab

Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: I2edecbf66ccb2141c72294d324ade79574f5c084
Test-Parameters: trivial
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42052
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Thu, 4 Feb 2021 08:22:56 +0000 (17:22 +0900)]
LU-14401 sec: fix migrate for encrypted dir

When setting an encryption policy on a directory that we want to
be encrypted, we need to make sure it is empty.
But, in some cases, setting the LL_XATTR_NAME_ENCRYPTION_CONTEXT xattr
should be allowed on non-empty directories, for instance when a
directory is migrated across MDTs into new shard directories.
Also, it is required for the encryption key to be available on the
client when migrating a directory so that the filenames can be
properly rehashed for the new MDT directory shard.
And, in any case, we need to prevent explicit setting of
LL_XATTR_NAME_ENCRYPTION_CONTEXT xattr outside of encryption policy
definition.

Update sanity-sec test_49 to test migration of non-empty encrypted
directory, and add sanity-sec test_57 to test security.c protection.

Lustre-change: https://review.whamcloud.com/41413
Lustre-commit: 67c4cffac6dbd30ce30e1d3132b65d4e4a374dda

Test-Parameters: clientdistro=el8.3 testlist=sanity-sec
Fixes: e8f74fb0f5 ("LU-12275 sec: verify dir is empty when setting enc policy")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2466ea35a871c6c07bdcf9fba7191485e855e655
Reviewed-on: https://review.whamcloud.com/42043
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Tue, 9 Mar 2021 17:35:13 +0000 (09:35 -0800)]
LU-14373 kernel: kernel update RHEL8.3 [4.18.0-240.10.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.10.1.el8_3.

Lustre-commit: 3d7db0c442d4b8edca7710ccf3a641ed1982a485
Lustre-change: https://review.whamcloud.com/41349

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Change-Id: I50a2cd22aa5bcd3a91745ffd36bc77e2d66d481b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41899
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Shaun Tancheff [Thu, 11 Mar 2021 23:45:11 +0000 (15:45 -0800)]
LU-14099 build: Fix for unconfigured arch_stackwalk

On aarch64 CONFIG_ARCH_STACKWALK is not defined and
print_stack_trace is not available.

Replace print_stack_trace with an open-coded variant
using %pB introduced in Linux v2.6.38-6557-g0f77a8d37825

This also fixes the symbols lookup of stack_trace_save_tsk
using kallsyms at module init time over the use of
symbol_get.

Lustre-commit: 58ac9d3f1844701f68444ecf6228e92a575809c4
Lustre-change: https://review.whamcloud.com/40503

HPE-bug-id: LUS-9518
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I04c3a0a84bb1a05d813a90502d1ed0f5bb2e33ab
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Amir Shehata [Thu, 6 Feb 2020 04:23:20 +0000 (20:23 -0800)]
EX-773 lustre: Support RDMA only pages

Some memory architectures and CPU-offload cards with
on-board memory do not map data pages into the CPU
address space. Allow RDMA of data directly into those
pages without accessing contents.

Therefore, this patch prevents checksumming of these types of
pages.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I189c34893ffa500ed275f2a1f79e8fb817a2489d
Reviewed-on: https://review.whamcloud.com/37454
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42002
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Amir Shehata [Thu, 6 Feb 2020 03:14:17 +0000 (19:14 -0800)]
EX-773 lnet: add LNet GPU Direct Support

This patch exports registration/unregistration functions
which are called by the NVFS module to let the LND know
that it can call into the NVFS module to do RDMA mapping
of GPU shadow pages.

GPU priority is considered during NI selection.

Writes smaller than 4K are always RDMAed if the RDMA source is
the GPU device.

The DMA mapping function provided by the GPU Direct driver
returns < 0 on failure, which is not in keeping with the
kernel-provided mapping function, which returns 0 on failure.

The code is changed slightly to handle the non-standard return code.

Also properly handle mapping errors in the standard code path.
If ib_dma_map_sg() returns 0, there is no need to go through the
rest of the rd processing; just return an error.

When an RDMA mapping failure occurs, mark the failure with a
unique errno, EHWPOISON. Record that error in the message
event. When the message is finalized and the event is
propagated to the ptlrpc layer, if the mapping error has
occurred, flag the request not to be resent. This avoids cases
where Lustre enters an RPC resend loop without a way to
terminate the loop.

RDMA mapping errors are assumed to be fatal, and therefore
there is no point in retrying the request on the same memory.
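
The two failure conventions and the no-resend rule can be modelled
as below. This Python sketch is illustrative only: the function names
are hypothetical, the EHWPOISON value is the common Linux one and is
assumed here, and the resend check models just this one flag rather
than the full ptlrpc logic.

```python
EHWPOISON = 133  # common Linux value; assumed here for illustration


def mapping_failed(rc, gpu_path):
    """Normalize the two failure conventions described above."""
    if gpu_path:
        return rc < 0   # GPU Direct driver: negative on failure
    return rc == 0      # kernel mapping function: 0 on failure


def map_status(rc, gpu_path):
    """Return 0 on success or -EHWPOISON when the mapping failed."""
    return -EHWPOISON if mapping_failed(rc, gpu_path) else 0


def should_resend(msg_status):
    """A request whose message failed with EHWPOISON is not resent."""
    return msg_status != -EHWPOISON
```

Using one distinct errno for both paths lets the upper layer tell a
fatal mapping failure apart from errors that are worth retrying.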

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I2bfdbdd5fe3b8536e616ab442d18deace6756d57
Reviewed-on: https://review.whamcloud.com/37368
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/42001
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-773 lnet: RDMA infrastructure updates
Amir Shehata [Thu, 6 Feb 2020 01:46:03 +0000 (17:46 -0800)]
EX-773 lnet: RDMA infrastructure updates

Add infrastructure to force RDMA for payloads < 4K.
Add infrastructure to extract the first page in a
payload. Useful for determining the type of the payload
to be transmitted.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id7dc26c83f00dadd26feca94fc4d8233872650d3
Reviewed-on: https://review.whamcloud.com/37453
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42000
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
4 years agoLU-14488 o2ib: Use rdma_connect_locked if it is defined
Sergey Gorenko [Thu, 4 Mar 2021 12:33:16 +0000 (14:33 +0200)]
LU-14488 o2ib: Use rdma_connect_locked if it is defined

rdma_connect_locked() is added in the upstream kernel 5.10 and
MOFED-5.2-2. After that, it is not allowed to call rdma_connect()
in RDMA CM event handler; rdma_connect_locked() must be used
instead.

This commit adds configure checks to detect whether
rdma_connect_locked() is available and updates the event handler
to call the correct function.

Lustre-change: https://review.whamcloud.com/41887
Lustre-commit: 60d55e42ed9e043341790bf7624627c93cc99200

Test-Parameters: trivial
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Change-Id: I8068d04810bf6f0200292a55f3fdcea8c71d44c1
Reviewed-on: https://review.whamcloud.com/41887
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2771 lustre: add 80-es-multi-rail.conf
John L. Hammond [Thu, 4 Mar 2021 18:30:14 +0000 (12:30 -0600)]
EX-2771 lustre: add 80-es-multi-rail.conf

Add /etc/sysctl.d/80-es-multi-rail.conf with all the needed sysctl
settings for multi rail.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I98a1841b18f7edfaa9649de3a6bd84d516833220
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-4684 tests: enable racer directory migration
Andreas Dilger [Thu, 28 Jan 2021 20:44:27 +0000 (13:44 -0700)]
LU-4684 tests: enable racer directory migration

Enable the dir_migrate test by default in racer test runs.

Update test selection logic to match newer script code style.

Lustre-change: https://review.whamcloud.com/41359
Lustre-commit: TBD (from 84cbbc51ef6c04fd62bc32a5f31dd2a2a2f47ebb)

Test-Parameters: trivial testlist=racer env=DURATION=3600
Test-Parameters: fstype=zfs testlist=racer env=DURATION=600
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifba84c64b30d90b4a159232751b68c48c88dafcc
Reviewed-on: https://review.whamcloud.com/41963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14478 ldiskfs: support Ubuntu 20.04.1 kernel 5.4.0-66
Jian Yu [Fri, 12 Mar 2021 00:17:53 +0000 (16:17 -0800)]
LU-14478 ldiskfs: support Ubuntu 20.04.1 kernel 5.4.0-66

This patch fixes the conflict in ext4-pdirop.patch to support
Ubuntu 20.04.1 server with kernel version greater than or
equal to 5.4.0-66.

Lustre-commit: d507f8648615d3e015cf4b20494bba734ab8a323
Lustre-change: https://review.whamcloud.com/41786

Test-Parameters: trivial

Change-Id: I336f5bb430f87aaefc6d79a782dfd779d20e0cf7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/42015
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14279 test: fix block soft testing failure
Wang Shilong [Mon, 28 Dec 2020 02:33:24 +0000 (10:33 +0800)]
LU-14279 test: fix block soft testing failure

The soft least qunit was introduced to avoid a performance
drop when a user has reached the soft limit but the timer has
not yet expired: the slave tries to acquire more space (not more
than the least qunit) to keep reasonable performance.

Test cases need to be aware of this, which means a slave might
exceed the quota limit slightly (but by no more than the least
qunit, e.g. 4M).
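
The tolerance a test needs can be expressed as a simple bound. A
minimal Python sketch, assuming sizes in KiB and a 4 MiB least qunit
(the "4M" example from the message); within_soft_tolerance() is a
hypothetical helper, not a function from sanity-quota.sh:

```python
LEAST_QUNIT_KB = 4 * 1024  # assumed 4 MiB least qunit, expressed in KiB


def within_soft_tolerance(used_kb, soft_limit_kb, least_qunit_kb=LEAST_QUNIT_KB):
    """True if usage exceeds the soft limit by at most one least qunit."""
    return used_kb <= soft_limit_kb + least_qunit_kb
```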

Lustre-change: https://review.whamcloud.com/41094
Lustre-commit: a71382df0204fe2cd465eba3873574118f46622b

Test-Parameters: trivial testlist=sanity-quota env=ONLY="3a 3b 3c"
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ia221d97d158a8da4dc1fe1611aebac2f5086440e
Reviewed-on: https://review.whamcloud.com/41094
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/41997
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14462 gss: fix support for namespace in lgss_keyring
Sebastien Buisson [Mon, 22 Feb 2021 15:24:11 +0000 (00:24 +0900)]
LU-14462 gss: fix support for namespace in lgss_keyring

Fix the way lgss_keyring handles different mount namespaces,
so that we do not try to bind to a namespace that does not exist.

Lustre-change: https://review.whamcloud.com/41716
Lustre-commit: 0ab950762cdec28636b6033c50a2e563c28ba954

Fixes: 94c44c62de ("LU-7845 gss: support namespace in lgss_keyring")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia5a5213399decc683d5e9401b6594e7fe579123f
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41965
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14388 utils: always enable ldiskfs project quota
Andreas Dilger [Sat, 30 Jan 2021 19:42:36 +0000 (12:42 -0700)]
LU-14388 utils: always enable ldiskfs project quota

Always enable project quota for newly-formatted ldiskfs filesystems.

Lustre-change: https://review.whamcloud.com/41370
Lustre-commit: 79642e08969eb4455bd8e23574b76f0a84d4db23

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1b0f745bc04b5c42592bcc4fd9823d068fef2a79
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41964
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14268 lod: fix layout generation inc for mirror split
Bobi Jam [Tue, 22 Dec 2020 05:58:41 +0000 (13:58 +0800)]
LU-14268 lod: fix layout generation inc for mirror split

Mirror split does not increase the layout generation properly.

Mirror split also does not change the FLR state of the file, even
when the file contains only 1 mirror afterwards; in that case the
FLR state should be LCM_FL_NONE.
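
The intended behaviour can be modelled as below. This is a hedged
Python sketch rather than the LOD code: the Layout class and
mirror_split() are hypothetical, and LCM_FL_RDONLY is only assumed as
the starting state for illustration.

```python
from dataclasses import dataclass

LCM_FL_NONE = "LCM_FL_NONE"
LCM_FL_RDONLY = "LCM_FL_RDONLY"


@dataclass
class Layout:
    mirrors: list
    generation: int = 1
    flr_state: str = LCM_FL_RDONLY


def mirror_split(layout, mirror_id):
    """Remove one mirror, bump the generation, and reset FLR state if needed."""
    layout.mirrors.remove(mirror_id)
    layout.generation += 1  # every layout change gets a new generation
    if len(layout.mirrors) == 1:
        # A single remaining mirror means the file is no longer mirrored.
        layout.flr_state = LCM_FL_NONE
    return layout
```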

Lustre-commit: ffa858b1657145c7e3d9988291fbb1ef72b3b980
Lustre-change: https://review.whamcloud.com/41068

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9c9621d67d901f2e9ca6ed3e0684cd308c396076
Reviewed-on: https://review.whamcloud.com/41068
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41962
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14423 osd: recognize holes in osd_is_mapped()
Alex Zhuravlev [Thu, 11 Feb 2021 14:33:01 +0000 (17:33 +0300)]
LU-14423 osd: recognize holes in osd_is_mapped()

ldiskfs_fiemap() can return {0,0,0} for last non-allocated
region.  osd_is_mapped() should be able to recognize and
cache this state.

Lustre-change: https://review.whamcloud.com/41481
Lustre-commit: 2eaa49ef0f16798d564883b16cea9e96fad52495

Fixes: 144b5a65c1 ("LU-7132 osd-ldiskfs: speedup rewrites")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I03883038c2c0ec84754377a442c4947c7e3021a9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-on: https://review.whamcloud.com/41960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12125 mds: allow parallel directory rename
Andreas Dilger [Thu, 14 Jan 2021 23:43:34 +0000 (16:43 -0700)]
LU-12125 mds: allow parallel directory rename

Allow rename of subdirectories in the same parent directory to be
done in parallel, by only taking the DLM lock on the parent FID,
without locking the global LUSTRE_BFL_FID (Big Filesystem Lock).

There will still be proper serialization from the parent directory
FID lock for other rename operations affecting that directory or the
subdirectories themselves.  Since the subdirectories are known to be
within the same parent, there is no concern of "finding the parent"
to determine locking order.

The same compatibility rules apply as with parallel file renames.

We no longer need the target file type in this case, since the
source and target file type are verified to be the same under lock.
We may again need a file type check for regular file renames across
different parent directories/shards, so it may still be useful.

Lustre-change: https://review.whamcloud.com/41230
Lustre-commit: TBD (from 88dd4ff29b84e2ab0e6155410c3cbcc52f6e82be)

Test-Parameters: testlist=racer env=DURATION=3600
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9db8acb515f5274fa09f5a5f3d18504d4d3ebbe5
Reviewed-on: https://review.whamcloud.com/41959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-12125 mds: allow parallel regular file rename
Andreas Dilger [Sat, 9 Jan 2021 09:08:06 +0000 (02:08 -0700)]
LU-12125 mds: allow parallel regular file rename

Allow rename of non-directory files in the same directory to be done
in parallel, by only taking the DLM lock on the parent FID, without
also locking the global LUSTRE_BFL_FID (Big Filesystem Lock).

Older clients may not send the renamed file mode in mds_rec_rename.
In this case, the LUSTRE_BFL_FID lock will still be taken, and is not
worse than before parallel rename was allowed.

Similarly, if (for whatever reason) there is a mix of MDS versions
running in the same filesystem, at worst older MDSes will continue to
unnecessarily lock LUSTRE_BFL_FID before doing the file rename.

If MDT0000 is on an older MDS, but newer MDSes are doing renames of
non-directories, the newer MDSes will *not* lock LUSTRE_BFL_FID first,
but there will still be proper serialization from the parent directory
FID lock for other renames affecting the parent and the source/target
entries.  That MDT0000 is unaware of the rename is the whole point.

In case of a race, where the file mode sent by the client is stale,
this is also not a concern, because the file mode is rechecked later
under lock and the rename fails if the source and target mode differ.
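
The locking decision described above can be summarized in a few
lines. This is a Python sketch under stated assumptions, not the MDT
code: needs_bfl() is a hypothetical helper, FIDs are plain strings,
and mode is None when an older client did not send the file mode.

```python
import stat


def needs_bfl(src_parent_fid, tgt_parent_fid, mode):
    """Return True if the rename must still take the LUSTRE_BFL_FID lock."""
    if mode is None:
        return True  # older client: no file mode, keep the old behaviour
    if stat.S_ISDIR(mode):
        return True  # this change only covers non-directory files
    # Same-directory renames rely on the parent FID lock alone.
    return src_parent_fid != tgt_parent_fid
```

The mode passed in may be stale; as the message notes, that is safe
because the mode is rechecked later under lock.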

Lustre-change: https://review.whamcloud.com/41186
Lustre-commit: d76cc65d5d68ed3e04bfbd9b7527f64ab0ee0ca7

Test-Parameters: testlist=racer env=DURATION=3600
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If330b53eb6db46e40f50fd7834a83e80db3ebbe5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/41958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-585 tests: add hot-pools.sh for lamigo and lpurge tests
Jian Yu [Tue, 9 Mar 2021 07:37:28 +0000 (23:37 -0800)]
EX-585 tests: add hot-pools.sh for lamigo and lpurge tests

This patch adds hot-pools.sh test script to test lamigo and
lpurge utilities for Hot Pools feature.

Lustre-commit: a57e6a22dfec43fbcda5bc259c7be7b076f465c2
Lustre-change: https://review.whamcloud.com/39616

EX-1859 tests: sleep some time in hot-pools.sh test_6()

This patch fixes hot-pools.sh test_6() to sleep some time
before dumping lamigo stats and verifying chlg_user param.

Lustre-commit: 92fedefe242a1ec6d486929b6c355b6d2238e56b
Lustre-change: https://review.whamcloud.com/40532

EX-1860 tests: flush client cache in hot-pools.sh test_56()

This patch fixes hot-pools.sh test_56() to flush client cache
before resynchronizing the mirrored file and also shorten the
test running time by skipping checksum validation on files
larger than 2GB.

Lustre-commit: 6d4c30106bdef4aea13145f297dc92bc6f0d5bfa
Lustre-change: https://review.whamcloud.com/40627

EX-585 tests: improve hot-pools.sh to support multiple MDTs

This patch improves hot-pools.sh to support multiple MDSs and MDTs.

Lustre-commit: 5be8906ae03f6fb62185198a1a25ddb23b3bf29a
Lustre-change: https://review.whamcloud.com/40719

EX-2745 tests: add hot-pools.sh test 56 into except list

Temporarily add hot-pools.sh test 56 into except list because
the test caused EX-2745 and EX-2757.

Lustre-commit: c00b98d2e5e215923090fec7e8d7b24be07745e9
Lustre-change: https://review.whamcloud.com/41873

EX-2689 tests: initiate hot pools test env inside subtests

Before running hot pools tests, we need to register changelog user,
create OST pools and mount Lustre client on server node. Since
changelog user deregistration and OST pools destroy are called inside
changelog_register() and create_ost_pools() as trap commands on EXIT,
we need to initiate the test env inside subtests.

This patch also fixes hot-pools.sh subtests 4,7,8,10 to redirect
both stdout and stderr to a temporary debug file.

Lustre-commit: 5e9341303abe776a549971c70ee2705c482a1d71
Lustre-change: https://review.whamcloud.com/41820

Test-Parameters: trivial clientdistro=el8.3 testlist=hot-pools \
serverextra_install_params="--lipe-job lipe --lipe-build 0 -k zfs"

Test-Parameters: trivial testlist=hot-pools \
mdscount=2 mdtcount=4 \
serverextra_install_params="--lipe-job lipe --lipe-build 0 -k zfs"

Test-Parameters: trivial testlist=hot-pools,ost-pools,sanity-flr \
serverextra_install_params="--lipe-job lipe --lipe-build 0 -k zfs"

Change-Id: I03ca24d4e74a3152be6880b01683147965194f00
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41842
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
4 years agoLU-13857 obdclass: Add white space to output valid YAML.
Lei Feng [Mon, 22 Feb 2021 02:06:03 +0000 (10:06 +0800)]
LU-13857 obdclass: Add white space to output valid YAML.

YAML needs a space after the colon (:) that separates a key from its
value. In this case, if the integer is large enough, no space is
left, so insert the space explicitly.
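
The failure mode is easy to reproduce with any width-based format.
A small Python illustration (the kernel format string is not shown in
this message, so the field widths and function names here are
assumptions):

```python
def format_broken(key, value):
    # The padding comes only from the field width, so a wide value
    # fills the field and ends up directly against the colon.
    return f"{key}:{value:>10}"


def format_fixed(key, value):
    # A literal space after the colon survives any value width.
    return f"{key}: {value:>9}"
```

With an 11-digit value, format_broken() yields "key:12345678901",
which is not a valid YAML mapping, while format_fixed() keeps the
space.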

Lustre-commit: 151f5322d30ec52a1b99c852e5adbdbbe6fc7e08
Lustre-change: https://review.whamcloud.com/41709

Change-Id: I366b5399cc293a66a70ea6084c6a5fa30a58813b
Signed-off-by: Lei Feng <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41709
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/41915
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-13668 mdt: change lock mode for lease
Alex Zhuravlev [Wed, 17 Jun 2020 14:05:28 +0000 (17:05 +0300)]
LU-13668 mdt: change lock mode for lease

Make the lease lock mode PW so that lfs getstripe and open-for-read
do not interrupt replication.

Lustre-change: https://review.whamcloud.com/38964
Lustre-commit: TBD (from a1738536459d9ea87b4d89e54cde624487fbc53d)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I20f4bbbc4e7bf9055333aba1b8cca80aa899c664
Reviewed-on: https://review.whamcloud.com/41840
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14395 kernel: kernel update RHEL7.9 [3.10.0-1160.15.2.el7]
Jian Yu [Wed, 3 Mar 2021 01:41:12 +0000 (17:41 -0800)]
LU-14395 kernel: kernel update RHEL7.9 [3.10.0-1160.15.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.15.2.el7.

Change debuginfo download location since debuginfo.centos.org
does not provide kernel-debuginfo-common anymore.

The patch also reverts the following fix from RHEL 7.9 kernel
since version 3.10.0-1160.8.1.el7:

- [kernel] timer: Fix lockup in __run_timers() caused by
  large jiffies/timer_jiffies delta (Waiman Long) [1849716]

The above fix caused Hard LOCKUP kernel panic.

Test-Parameters: clientdistro=el7.9 serverdistro=el7.9
Change-Id: Icdd9e8bf4bd595dece266f6c5a9b0de344781a93
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41902
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2616 kernel: add missing kernel patches back to rhel7.9
Li Dongyang [Tue, 2 Mar 2021 07:11:55 +0000 (18:11 +1100)]
EX-2616 kernel: add missing kernel patches back to rhel7.9

The patches were added for rhel7.7 but we still need them
for rhel7.9

Test-Parameters: serverdistro=el7.9
Change-Id: If84e08220e984019dbc71ea47c1202db7e5e70ac
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41912
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-1951 osc: workaround osc aio crash
John L. Hammond [Fri, 16 Oct 2020 21:36:53 +0000 (16:36 -0500)]
EX-1951 osc: workaround osc aio crash

This avoids an AIO-related use-after-free on the object
associated with the page in osc_page_delete().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: If2ad7d673bb2cce364982544f097d57ca28ccbe9
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41843
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14180 utils: verify setstripe comp_end is valid
Andreas Dilger [Sun, 28 Feb 2021 23:51:14 +0000 (15:51 -0800)]
LU-14180 utils: verify setstripe comp_end is valid

Verify that the "lfs setstripe -E <component_end>" value is valid.
Otherwise, if "-S" is not specified at the same time, then an
invalid file layout can be created and the file cannot be deleted
normally, only via "lfs rmfid <FID>".

Allow values < 4096 (e.g. '64' or '128' which would all be invalid
anyway) to be interpreted as KiB units.
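
The unit rule can be sketched as below. This is a Python
illustration, not the lfs code: parse_comp_end() is hypothetical, the
64 KiB alignment requirement is an assumption not stated in this
message, and unit suffixes and special end-of-file values are ignored.

```python
MIN_ALIGN = 64 * 1024  # assumed alignment unit in bytes


def parse_comp_end(value):
    """Return the component end in bytes, or raise ValueError."""
    if value <= 0:
        raise ValueError("component end must be positive")
    if value < 4096:
        value *= 1024  # small bare numbers are taken as KiB
    if value % MIN_ALIGN != 0:
        raise ValueError("component end is not aligned")
    return value
```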

Update usage messages and man pages to match.

Lustre-change: https://review.whamcloud.com/41239
Lustre-commit: 83e38bba6237f838c9a5d7d36b258cf6dd28bd13

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I47fe7729ffd447c1c1cc098e5117e456263ebbe5
Reviewed-on: https://review.whamcloud.com/41790
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoRM-844 lbuild: add support for building kmod-lustre-o2ib-ofed
Wang Shilong [Wed, 4 Apr 2018 01:15:27 +0000 (09:15 +0800)]
RM-844 lbuild: add support for building kmod-lustre-o2ib-ofed

With this patch, we will be able to build OFED
and MLNX Lustre RPMs at the same time.

Introduce an extra lbuild option, --enable-o2ib_kernel,
to enable the OFED RPMs; by default, they are not built.

This is a quick workaround so that we can support
both MLNX and OFED KMOD RPMs. In the long term,
we should handle this more gracefully.

Test-Parameters: trivial
Change-Id: I581535d2e13b5d412a9b6e057294eb146f4ef086
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/41875
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-1428 kernel: contain update related to snapshot
Hongchao Zhang [Fri, 25 Sep 2020 10:27:04 +0000 (18:27 +0800)]
EX-1428 kernel: contain update related to snapshot

Add the extra fields used by snapshot to "struct
jbd2_journal_handle" and "struct journal_head". The fields go
into the 4-byte hole at the end of struct jbd2_journal_handle so
that they do not increase the structure size and memory
usage for this common allocation.

Test-Parameters: trivial
Change-Id: I6bab8ae1cfcab39632cb1397ffd7991f72e00fb4
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41882
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoRM-844 lbuild: add support for building kmod-lustre-o2ib-mlnx RPM
Wang Shilong [Sun, 1 Apr 2018 23:11:22 +0000 (07:11 +0800)]
RM-844 lbuild: add support for building kmod-lustre-o2ib-mlnx RPM

This patch adds a new kmod RPM which is based
on the MLNX_OFA driver.

Test-Parameters: trivial
Change-Id: Ibea5b4bd43a62316d77e4b8de735b05dec3ccdf8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/41874
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13970 tests: skip sanity test_427 on SLES12
Li Xi [Wed, 3 Mar 2021 09:26:42 +0000 (17:26 +0800)]
LU-13970 tests: skip sanity test_427 on SLES12

For SLES12 sanity.sh test_427 cannot determine the number of
objects in the lustre_inode_cache slab, so skip this old distro.

Test-Parameters: trivial testlist=sanity env=ONLY=427 clientdistro=sles12sp5
Fixes: acbc5b5bb2b5 ("LU-13970 llite: add option to disable Lustre inode cache")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia6f2d57969d83d763ed91251dfaaf9929c3ebbe5
Reviewed-on: https://review.whamcloud.com/41851
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14430 mdd: fix inheritance of big default ACLs
Mikhail Pershin [Fri, 12 Feb 2021 07:16:24 +0000 (10:16 +0300)]
LU-14430 mdd: fix inheritance of big default ACLs

If the number of default ACLs in a directory is more than 31, then
mdd_acl_init() fails to inherit them for a newly created file.
This limitation is caused by using a fixed-size def_acl_buf buffer
in the mdd_create()->mdd_acl_init() call chain. Instead, the
default ACL buffer should be increased when it is needed.

The patch adds a check for -ERANGE after mdd_acl_init(), reallocates
the default ACL buffer with the required size and calls mdd_acl_init()
again. Thus big default ACLs are processed as expected.
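
The retry pattern reads roughly as follows. A hedged Python model,
not the MDD code: acl_init() and create_with_default_acl() are
hypothetical stand-ins, sizes are counted in ACL entries, and the
caller is assumed to know the required size after the first failure.

```python
import errno

FIXED_ENTRIES = 31  # size of the original fixed buffer, in entries


def acl_init(buf_entries, needed):
    """Hypothetical stand-in: -ERANGE if the buffer is too small, else 0."""
    return -errno.ERANGE if needed > buf_entries else 0


def create_with_default_acl(needed):
    """Try the fixed buffer first, then retry once with a larger one."""
    buf_entries = FIXED_ENTRIES
    rc = acl_init(buf_entries, needed)
    if rc == -errno.ERANGE:
        buf_entries = needed  # reallocate with the required size
        rc = acl_init(buf_entries, needed)
    return rc, buf_entries
```

The common case still uses the small buffer; only directories with
many default ACL entries pay for the second call.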

Lustre-commit: f3d03bc38a3afdef83635d578ee0b2ffdd985685
Lustre-change: https://review.whamcloud.com/41494

Fixes: 6350af100c20 ("LU-3437 mdd: Fix ACL/def_ACL during object creation")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I700da90c09f824955fcb8dc7ca0bc2f581f916a0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41861
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14375 kernel: kernel update SLES15 SP2 [5.3.18-24.46.1]
Li Xi [Fri, 5 Mar 2021 15:59:24 +0000 (23:59 +0800)]
LU-14375 kernel: kernel update SLES15 SP2 [5.3.18-24.46.1]

Update SLES15 SP2 kernel to 5.3.18-24.46.1 for Lustre client.

Test-Parameters: trivial \
env=SANITY_EXCEPT="100 130 136 817" \
clientdistro=sles15sp2 serverdistro=el7.9 \
testlist=sanity

Change-Id: I45ec236657ea4f54d7a08a8f0af8397cb161c4bf
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41901
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14376 kernel: kernel update SLES12 SP5 [4.12.14-122.57.1]
Li Xi [Fri, 5 Mar 2021 15:57:07 +0000 (23:57 +0800)]
LU-14376 kernel: kernel update SLES12 SP5 [4.12.14-122.57.1]

Update SLES12 SP5 kernel to 4.12.14-122.57.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 817" testlist=sanity

Change-Id: I1ad5feb6f63cbaa948226fcb4248a2a767b67ce3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41900
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13659 kernel: kernel update SLES12 SP4 [4.12.14-95.54.1]
Li Xi [Fri, 5 Mar 2021 15:50:30 +0000 (23:50 +0800)]
LU-13659 kernel: kernel update SLES12 SP4 [4.12.14-95.54.1]

Update SLES12 SP4 kernel to 4.12.14-95.54.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp4 \
envdefinitions=LNET_SELFTEST_EXCEPT=smoke,SANITY_EXCEPT="103a 817"

Change-Id: If7b9143bec6d9c526bd65e96a771c83f2530e608
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41898
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2596 build: add ddn tag to build MOFED
Minh Diep [Thu, 11 Feb 2021 05:12:58 +0000 (21:12 -0800)]
EX-2596 build: add ddn tag to build MOFED

We need to add the ddn tag to the MOFED kmod build
to indicate that kmod-mlnx-ofed_kernel is different
from the stock version.

Test-Parameters: trivial

Change-Id: Ibc8b95c228afa921cedbc30d74196754e0f0cc24
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41879
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoRM-849 lbuild: add mlnx fix patches
Wang Shilong [Wed, 20 Jun 2018 04:31:46 +0000 (12:31 +0800)]
RM-849 lbuild: add mlnx fix patches

We might need to apply some mlnx patches and maintain
them ourselves until an official release includes them.

[Updates: now we might not need mlnx 4.3, but let's keep the ability...]

Change-Id: Id357d581a602153db65c6c00d6475d01d4761c04
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/41878
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoRM-790 : fix Kernel string mismatch for ppc64le in RPMs
Gu Zheng [Mon, 4 Sep 2017 10:32:23 +0000 (06:32 -0400)]
RM-790 : fix Kernel string mismatch for ppc64le in RPMs

The macro in the kernel RPMs for ppc64 provided
by Red Hat, which describes the supported kernel version,
does not match the macro in the Lustre kernel
client modules for ppc64le, which describes the required
kernel. This leads to unresolved dependency errors
during installation of the Lustre client modules
even though the kernel versions actually match.

While Red Hat uses the convention:

kernel = <major-number>-<minor-release-number>.el6

the following string is used in Lustre client RPM
for the required kernel version:

kernel = <major-number>-<minor-release-number>.el6.ppc64le

The patch removes the trailing string '.ppc64le' from the
macro 'krequires' defined in the Lustre specfile, which
describes the software dependency.
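
The string fix amounts to dropping one suffix. A Python illustration
(the actual change is in the RPM specfile; strip_arch_suffix() and
the version string in the usage note are made up for this example):

```python
def strip_arch_suffix(krequires, arch="ppc64le"):
    """Drop a trailing '.<arch>' so the string matches the kernel RPM."""
    suffix = "." + arch
    if krequires.endswith(suffix):
        return krequires[:-len(suffix)]
    return krequires
```

For example, strip_arch_suffix("1.2.3-4.el6.ppc64le") gives
"1.2.3-4.el6", and a string without the suffix is returned unchanged.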

AUTOTEST:ONLY:sanity

[Shilong added autotest args]

Change-Id: Id1d9fd92928a567048d5ad6e9d0c872b5666b4e6
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/41877
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14398 hsm: use llapi_fid2path_at() in the copytool
John L. Hammond [Wed, 3 Feb 2021 20:19:05 +0000 (14:19 -0600)]
LU-14398 hsm: use llapi_fid2path_at() in the copytool

In lhsmtool_posix.c and liblustreapi_hsm.c, convert several uses of
llapi_fid2path() to llapi_fid2path_at().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ice64d02010b4260287be4d4e26c6b75b178bc81b
Reviewed-on: https://review.whamcloud.com/41865
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14398 lfs: use llapi_fid2path_at() in lfs_fid2path()
John L. Hammond [Wed, 3 Feb 2021 20:17:11 +0000 (14:17 -0600)]
LU-14398 lfs: use llapi_fid2path_at() in lfs_fid2path()

Use llapi_fid2path_at() in lfs_fid2path(). This avoids resolving and
opening the mount point for each FID argument passed. Make the -c,
--cur, --current option actually print the link. Add a more
descriptive long option name for this (--print-link). Update the
lfs-fid2path manpage accordingly.

Lustre-change: https://review.whamcloud.com/41407

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: If851e4ce95f87d3188b644eb4a345ba3cfca530d
Reviewed-on: https://review.whamcloud.com/41864
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14398 llapi: add llapi_fid2path_at()
John L. Hammond [Wed, 3 Feb 2021 19:06:16 +0000 (13:06 -0600)]
LU-14398 llapi: add llapi_fid2path_at()

Add llapi_fid2path_at(), which works like llapi_fid2path() but takes
an open FD on the mount point instead of a 'fsname or directory path'
and a const struct lu_fid * instead of a const char *.

Lustre-commit: c45558bf560cf43d440af5679b86ba7e4d2542f3
Lustre-change: https://review.whamcloud.com/41406

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I76234bc28de231587b65c5d866954441e0893aac
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/41863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14398 llapi: simplify llapi_fid2path()
John L. Hammond [Wed, 3 Feb 2021 18:33:26 +0000 (12:33 -0600)]
LU-14398 llapi: simplify llapi_fid2path()

Simplify llapi_fid2path(). Remove the fid_is_sane() check. Remove the
call to root_ioctl() and use get_root_path() directly.

Lustre-commit: 4cfe77df6f2499effa1644e6ad5a594abb11be23
Lustre-change: https://review.whamcloud.com/41405

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib70b8c9e239c77da8b46408de8341fc8aaf4d1c3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41862
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14436 tgt: only use T10PI guard when doing full sector read
Li Dongyang [Tue, 16 Feb 2021 12:40:05 +0000 (23:40 +1100)]
LU-14436 tgt: only use T10PI guard when doing full sector read

The T10PI guard is generated on full sectors. If we
do a partial read and still use the guard, the RPC
checksum won't match.
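
The condition can be written as an alignment check. A minimal Python
sketch, assuming a 512-byte sector (the real sector size depends on
the device) and a hypothetical helper name:

```python
def can_use_t10pi_guard(offset, length, sector_size=512):
    """True only when the read starts and ends on sector boundaries."""
    return (
        length > 0
        and offset % sector_size == 0
        and length % sector_size == 0
    )
```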

Lustre-commit: f44413717eaf5cb938d9c9b2b62d312f064d282a
Lustre-change: https://review.whamcloud.com/41677

Test-Parameters: trivial
Change-Id: I40d481d703a46b9711021a162208b86a956bd8d1
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41860
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14435 doc: include lfs-flushctx manpage inside packages
Sebastien Buisson [Tue, 16 Feb 2021 08:58:25 +0000 (17:58 +0900)]
LU-14435 doc: include lfs-flushctx manpage inside packages

lfs manpage redirects to lfs-flushctx(1), so it has to be
included in the Lustre packages.

Lustre-change: https://review.whamcloud.com/41676
Lustre-commit: ece23db121d94c8194fada3cf0d0d1f9d9beeed7

Test-Parameters: trivial
Fixes: c246a9ba04 ("LU-14263 gss: unlink revoked key")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5c55f8a74eb6dac20fa85b6ea0663ad701341006
Reviewed-on: https://review.whamcloud.com/41859
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14444 gss: handle empty reqmsg in sptlrpc_req_ctx_switch
Sebastien Buisson [Thu, 18 Feb 2021 11:03:31 +0000 (20:03 +0900)]
LU-14444 gss: handle empty reqmsg in sptlrpc_req_ctx_switch

In sptlrpc_req_ctx_switch(), everything is already there to handle
the case of a ptlrpc_request that has an empty rq_reqmsg.
But assertions were left over at the beginning of the function, so
just remove them from here.

Lustre-change: https://review.whamcloud.com/41685
Lustre-commit: dfe87b089b662663ba125769866c98e803f89a8c

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6ae1f8b9da9600d3b57b9efc9018c2461114f2fe
Reviewed-on: https://review.whamcloud.com/41858
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14455 mdt: fix DoM lock prolong logic
Mikhail Pershin [Fri, 19 Feb 2021 20:50:54 +0000 (23:50 +0300)]
LU-14455 mdt: fix DoM lock prolong logic

- don't stop at the first found lock if it is not PW or EX lock
- add LCK_GROUP lock as valid mode of lock to check

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If947f8565008953cc34146b6f0ac1e0f0a038bb5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41857
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13514 tests: replace nid in conf-sanity test_32
Yang Sheng [Wed, 4 Nov 2020 18:36:43 +0000 (02:36 +0800)]
LU-13514 tests: replace nid in conf-sanity test_32

test_32a needs replace_nid. Otherwise the mdc cannot
be initialized and the client mount hangs.

Lustre-change: https://review.whamcloud.com/40537
Lustre-commit: 327c8b77694bb0796f168df26e0c543d9610691e

Test-Parameters: trivial
Test-Parameters: env=ONLY=32a,ONLY_REPEAT=20 fstype=ldiskfs testlist=conf-sanity
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I651f5728ad4ff96a309ed599490c9dd6ed9c5274
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41856
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2010 scsi: requeue aborted commands instead of retry
Trung Nguyen [Mon, 9 Nov 2020 06:46:44 +0000 (23:46 -0700)]
EX-2010 scsi: requeue aborted commands instead of retry

If the underlying SCSI command returns an abort, rather than retry
it quickly in a loop, which can finish within a few milliseconds,
requeue it with delay so that the hardware has a chance to recover.

The command requeue will take several seconds each time and allows
more chance for the problem to be resolved at the SCSI layer instead
of returning an error to the filesystem and causing server failover.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Trung Nguyen <trunguyen@ddn.com>
Change-Id: Ibdf1b3a52dd0a1b388c7f5f97aa7a51620138845
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-on: https://review.whamcloud.com/41852
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13970 llite: add option to disable Lustre inode cache
Lai Siyao [Fri, 18 Sep 2020 09:53:17 +0000 (17:53 +0800)]
LU-13970 llite: add option to disable Lustre inode cache

A tunable option is added to disable Lustre inode cache:
"llite.*.inode_cache=0" (default =1)

When it's turned off, ll_drop_inode() always returns 1, then the last
iput() will release inode.

Add sanity 427.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I0642bdc694dc365a05395c3fae98131e1e7723c6
Reviewed-on: https://review.whamcloud.com/41850
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13397 lfs: mirror resync to keep sparseness
Mikhail Pershin [Wed, 25 Nov 2020 16:05:05 +0000 (19:05 +0300)]
LU-13397 lfs: mirror resync to keep sparseness

Use SEEK_HOLE/SEEK_DATA in llapi_mirror_resync_many() to
copy just data chunks between components. Holes at the last
component are done with truncate(), holes in other components
are done with fallocate(FALLOC_FL_PUNCH_HOLE). In case of any
punch() error the hole is just copied via read(), i.e. as zeroes.

fallocate(FALLOC_FL_PUNCH_HOLE) is not supported yet, so for now
resync preserves sparseness only for the last components.
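
The SEEK_DATA/SEEK_HOLE loop described above can be sketched on ordinary local files as follows. This is a minimal illustration, not llapi_mirror_resync_many(): copy_data_chunks() is a hypothetical name, it works on plain file paths rather than mirror components, and it recreates only the trailing hole (by extending the file), without any hole punching.

```python
import errno
import os


def copy_data_chunks(src_path, dst_path, bufsize=1 << 20):
    """Copy only the data regions of src_path to dst_path, skipping holes."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                # Start of the next region that contains data.
                data = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as err:
                if err.errno == errno.ENXIO:
                    break  # nothing but a hole remains before EOF
                raise
            # End of that data region (EOF counts as an implicit hole).
            hole = os.lseek(src, data, os.SEEK_HOLE)
            pos = data
            while pos < hole:
                chunk = os.pread(src, min(bufsize, hole - pos), pos)
                if not chunk:
                    break
                # pwrite() may write less than requested; advance by what it wrote.
                pos += os.pwrite(dst, chunk, pos)
            offset = hole
        # Extend the destination to full size so a trailing hole is kept.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)
```

On a filesystem that does not report holes, SEEK_DATA returns the given offset and SEEK_HOLE returns end of file, so the loop degrades to a plain full copy.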

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id249739c5cd2d1c8a998da3341d326de1a8b8d32
Reviewed-on: https://review.whamcloud.com/41849
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13397 lfs: mirror extend/copy keeps sparseness
Mikhail Pershin [Mon, 23 Nov 2020 11:06:12 +0000 (14:06 +0300)]
LU-13397 lfs: mirror extend/copy keeps sparseness

- make ll_lseek() work under group lock and on the designated
  mirror
- enhance lfs mirror copy functions migrate_copy_data() and
  llapi_mirror_copy_many() with lseek() to find holes and copy
  only data chunks.

Both the 'migrate' and 'copy' lfs functions rewrite the designated
mirror fully, so holes are not punched in the destination file.
Instead, truncate is called first to make sure old data is erased.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic4a8768b816c921acd7f0adb3311138caac05a7c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41848
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoDDN-675 virtio: fix high-order kmalloc() failures
Andreas Dilger [Sat, 17 Oct 2020 07:02:12 +0000 (01:02 -0600)]
DDN-675 virtio: fix high-order kmalloc() failures

virtio_scsi needs high-order (64KB) allocations for large SCSI
requests in atomic context.  The __GFP_HIGH mask is intended for
such uses and should not have been removed from the mask.

Add a memory pool for these allocations in case of failure under
fragmentation to allow the IO to complete.  The default number
of items in the mempool is 16, but it can be tuned at module load
time via a parameter in /etc/modprobe.d/lustre.conf:

    options virtio_ring vring_desc_pool_sz=N

This avoids the need for excessive memory reservation that was
previously set with "vm.min_free_kbytes=1048576" and similar.

The __GFP_NOWARN flag is added to avoid scary stack dumps, and
instead a brief error message is printed periodically so that
it is possible to track whether there are still allocation
issues, but not so verbose as to cause undue alarm.

 ll_ost_io01_053: page allocation failure: order:4, mode:0x104000
 CPU: 5 PID: 946 Comm: ll_ost_io01_053 3.10.0-862.9.1.el7_lustre.ddn1
 Call Trace:
    __alloc_pages_nodemask+0x9b4/0xbb0
    alloc_pages_current+0x98/0x110
    __get_free_pages+0xe/0x40
    kmalloc_order_trace+0x2e/0xa0
    __kmalloc+0x211/0x230
    virtqueue_add+0x1c4/0x4d0 [virtio_ring]
    virtqueue_add_sgs+0x87/0xa0 [virtio_ring]
    virtscsi_add_cmd+0x17a/0x270 [virtio_scsi]
    virtscsi_kick_cmd+0x38/0xa0 [virtio_scsi]
    virtscsi_queuecommand+0x15d/0x340 [virtio_scsi]
    virtscsi_queuecommand_multi+0x6e/0xe0 [virtio_scsi]
    scsi_dispatch_cmd+0xb0/0x240
    scsi_queue_rq+0x5a5/0x6f0
    blk_mq_dispatch_rq_list+0x96/0x640
    blk_mq_do_dispatch_ctx+0xe0/0x160
    blk_mq_sched_dispatch_requests+0x138/0x1c0
    __blk_mq_run_hw_queue+0xa2/0xb0
    __blk_mq_delay_run_hw_queue+0x9d/0xb0
    blk_mq_run_hw_queue+0x14/0x20
    blk_mq_sched_insert_requests+0x64/0x80
    blk_mq_flush_plug_list+0x19c/0x200
    blk_flush_plug_list+0xce/0x230
    blk_finish_plug+0x14/0x40
    osd_do_bio.isra.25+0x651/0x8d0 [osd_ldiskfs]
    osd_write_commit+0x3fc/0x8d0 [osd_ldiskfs]
    ofd_commitrw_write+0xffe/0x1c90 [ofd]
    ofd_commitrw+0x4c9/0xae0 [ofd]
    obd_commitrw+0x2f3/0x336 [ptlrpc]
    tgt_brw_write+0xffd/0x17d0 [ptlrpc]
    tgt_request_handle+0x92a/0x1370 [ptlrpc]
    ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
    ptlrpc_main+0xa92/0x1e40 [ptlrpc]

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I35d5ed6d0d83648e1b7f625a4f3c4c8a333ebbe5
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41845
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-12029 utils: disable max_sectors_kb autotuning
Andreas Dilger [Tue, 14 Jul 2020 19:58:06 +0000 (13:58 -0600)]
LU-12029 utils: disable max_sectors_kb autotuning

Disable automatic max_sectors_kb tuning via mount.lustre by default.
In EXA this conflicts with the tune_devices.sh script, which has
SFA-specific tuning parameters.

Disable l_tunedisk in /etc/udev/rules.d/99-lustre-server.rules by
default for EXA so that it does not conflict with the EXA script.

Allow shrinking max_sectors_kb if explicitly set via mount option.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I58cf548d08f8680ec5d6ffd00e936a5d903ebbe5
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41839
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-994 kernel: backport a virtio bug fix to lustre kernel
Wang Shilong [Wed, 8 Apr 2020 13:04:13 +0000 (21:04 +0800)]
EX-994 kernel: backport a virtio bug fix to lustre kernel

Backport the following commit to the RHEL7 kernel, as the bug
is frequently hit under heavy testing:

scsi: virtio: Reduce BUG if total_sg > virtqueue size to WARN.

If using indirect descriptors, you can make the total_sg as large as you
want.  If not, BUG is too serious because the function later returns
-ENOSPC.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Linux-commit: 44ed8089e991a60d614abe0ee4b9057a28b364e4

Change-Id: I237b5cfc4215093346224c8d0ba8a69541bf7694
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/38230
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41837
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-687 build: fix to generate lustre version on ubuntu
Wang Shilong [Fri, 1 Nov 2019 08:03:07 +0000 (16:03 +0800)]
EX-687 build: fix to generate lustre version on ubuntu

The problem is that dash doesn't support the >& operator and
reports it as "Syntax error: Bad fd number".

Test-parameters: trivial

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Iafc95756c4981798fe68dd4c51e2d6418335b7dd
Reviewed-on: https://review.whamcloud.com/41831
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoRM-620: add ddn tags to RPMs
Li Xi [Wed, 9 Oct 2019 15:23:21 +0000 (23:23 +0800)]
RM-620: add ddn tags to RPMs

The RPMs should have a format like the following:

2.12.2_ddn0_334_g460f956
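
A string of this shape can be split into its parts with a small regular expression. This is a sketch under assumptions: the commit does not say what each field means, so the field names below (base, tag, count, commit) are hypothetical and only reflect that the example resembles "git describe" output with underscores as separators.

```python
import re

# Assumed layout: <base version>_<ddn tag>_<number>_g<abbreviated hash>
VERSION_RE = re.compile(
    r"^(?P<base>\d+\.\d+\.\d+)_(?P<tag>ddn\d+)_(?P<count>\d+)_g(?P<commit>[0-9a-f]+)$"
)


def parse_ddn_version(text):
    """Return the fields of a version string, or None if it does not match."""
    match = VERSION_RE.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["count"] = int(fields["count"])
    return fields


print(parse_ddn_version("2.12.2_ddn0_334_g460f956"))
```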

Change-Id: Ic7dd5290b4e9b439c8513150e3813ddeb23f1dfd
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoDDN-941 test: relax sanity 450 memory checking
Wang Shilong [Fri, 10 Apr 2020 10:25:11 +0000 (18:25 +0800)]
DDN-941 test: relax sanity 450 memory checking

For debugging purposes, it could be helpful to output
memory usage.

Memory checking is somehow not working. Until that is figured
out, only output the usage rather than raising an error, to make
sure we don't block other patches from landing.

Test-Parameters: trivial
Change-Id: I76dcac18824f2fe3b2f8d1a86abe3fd6cba40a3e
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/41836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-852 ldiskfs: adjust default block allocation threshold
Wang Shilong [Wed, 4 Mar 2020 02:34:46 +0000 (10:34 +0800)]
EX-852 ldiskfs: adjust default block allocation threshold

The default 25% and 15% might be too aggressive, especially
since DDN OSTs will reach 1PB soon. Adjust them to 15% and 10%,
and also limit the max space to 15TB and 10TB.
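
The effect of combining a percentage with an absolute cap can be sketched as below. This only illustrates the arithmetic: the function name is hypothetical, TB is assumed to mean 10^12 bytes, and what each threshold controls in the allocator is not described here.

```python
TB = 10**12  # assumption: decimal terabytes
PB = 10**15


def effective_threshold(device_bytes, percent, cap_bytes):
    """Percentage of the device size, limited to an absolute maximum."""
    return min(device_bytes * percent // 100, cap_bytes)


# On a 1PB device, 15% would be 150TB, so the 15TB cap applies.
print(effective_threshold(PB, 15, 15 * TB) // TB)
# On a 10TB device, 10% is 1TB, well under the 10TB cap.
print(effective_threshold(10 * TB, 10, 10 * TB) // TB)
```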

Change-Id: I472160fc37fa8d119f13084f00bb22ea55c5ca18
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41835
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-752 kernel: backport two patches for E2EDI
Wang Shilong [Fri, 20 Dec 2019 02:21:41 +0000 (10:21 +0800)]
EX-752 kernel: backport two patches for E2EDI

From 40423bfb35bb6057bcefe93b738b5a9411e037a2 Mon Sep 17 00:00:00 2001
From: Jianchao Wang <jianchao.w.wang@oracle.com>
Date: Tue, 12 Feb 2019 09:56:25 +0800
blk-mq: insert rq with DONTPREP to hctx dispatch list
 when requeue

When requeue, if RQF_DONTPREP, rq has contained some driver
specific data, so insert it to hctx dispatch list to avoid any
merge. Take scsi as example, here is the trace event log (no
io scheduler, because RQF_STARTED would prevent merging),

   kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
   kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
   kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
   kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]

(32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
the sdb only contained the part of (32768 + 8), then only that part
was completed. The lucky thing was that scsi_io_completion detected
it and requeued the remaining part. So we didn't get corrupted data.
However, the requeue of (32776 + 8) is not expected.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[gedwards@ddn.com: s/RQF_DONTPREP/REQ_DONTPREP/]

From e0c0d06892051e17ee58e041259f339d0c194804 Mon Sep 17 00:00:00 2001
From: "Martin K. Petersen" <martin.petersen@oracle.com>
Date: Fri, 26 Sep 2014 19:20:06 -0400
block: Don't merge requests if integrity flags differ

We'd occasionally merge requests with conflicting integrity flags.
Introduce a merge helper which checks that the requests have compatible
integrity payloads.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
[gedwards@ddn.com: only compare INTEGRITY bi_flags since we don't have bip_flags yet]

Change-Id: Iece6117afa7b16c86e682e8f44b702ac79ea3609
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/41833
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-603 ioctl: LL_IOC_FID2MDTIDX on server mount point
Alex Zhuravlev [Wed, 16 Oct 2019 14:38:12 +0000 (17:38 +0300)]
EX-603 ioctl: LL_IOC_FID2MDTIDX on server mount point

Add LL_IOC_FID2MDTIDX ioctl support on the server's mount point.
This way lpurge can look up the MDS# by FID without a Lustre client,
so lpurge won't need a Lustre client on OSTs.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If3c8c96e75573b812688686a331a38250826cd05
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/41832
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoRM-633 ldiskfs: add loadbitmaps to load block bitmaps for rhel7.7
Wang Shilong [Wed, 4 Mar 2020 01:35:12 +0000 (09:35 +0800)]
RM-633 ldiskfs: add loadbitmaps to load block bitmaps for rhel7.7

Change-Id: I02bffab5eb809b2d8945562ad9b42f04929df380
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/41829
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoRM-633 ldiskfs: add loadbitmaps to load block bitmaps
Li Xi [Wed, 9 Oct 2019 15:26:07 +0000 (23:26 +0800)]
RM-633 ldiskfs: add loadbitmaps to load block bitmaps

Loading bitmaps helps performance on the DDN SFA 4K sector
platform. Small reads during writes will drop the whole
throughput.

Change-Id: Icc02c5f16c181417ab3833ffd612cf6c1287eeb5
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/41828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agolbuild: separate kernel tag from lustre version
Wang Shilong [Sat, 12 May 2018 06:46:16 +0000 (14:46 +0800)]
lbuild: separate kernel tag from lustre version

Bump kernel tags using tags separate from those of Lustre,
as Lustre changes more frequently than the kernel.

With this change, the DDN kernel tag is only bumped
if there are actual kernel changes.

Change-Id: Icdebe62fe1ef5de2096ae0f568494db394dc04f7
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/41827
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2674 kernel: add kernel config file for RHEL 8.3 aarch64
Jian Yu [Fri, 26 Feb 2021 17:54:17 +0000 (09:54 -0800)]
EX-2674 kernel: add kernel config file for RHEL 8.3 aarch64

This patch adds the kernel config file to support
RHEL 8.3 aarch64 server.

Test-Parameters: trivial \
clientdistro=el8.3 serverdistro=el8.3 \
clientarch=aarch64 serverarch=aarch64 \
testlist=sanity

Change-Id: Id63b4ed0622701bb4fb4ab00e38ecb7e0c6ab7ec
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41780
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2439 build: Add opa-src option to lbuild
Minh Diep [Wed, 27 Jan 2021 19:58:40 +0000 (11:58 -0800)]
EX-2439 build: Add opa-src option to lbuild

Create an option to pass in the opa tarball directly,
since we can not download directly from Intel site
anymore

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/41334
Lustre-commit: 3a1a94db3c84c94ad99b8437f482aebfeb6b6246

Change-Id: I3fcad6ee7eff35d26ad7659b6148b0df67c48ad0
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41749
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoNew release 2.14.0 2.14.0 v2_14_0
Oleg Drokin [Fri, 19 Feb 2021 19:28:17 +0000 (14:28 -0500)]
New release 2.14.0

Change-Id: I2eb99af8fbeaab80b6614e427b77949b1225b406
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-14345 misc: update e2fsprogs to 1.45.6.wc5 33/41433/3
Andreas Dilger [Mon, 8 Feb 2021 11:56:25 +0000 (04:56 -0700)]
LU-14345 misc: update e2fsprogs to 1.45.6.wc5

Update Changelog to reference new e2fsprogs release.

4aea203f LU-5949 e2fsck: call delete_inode() properly
8725134d LU-5949 e2fsck: simplify inode badness handling
71b74579 LU-14345 e2fsck: fix check of directories over 4GB

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie0845eeed410f6f9f8ef985342fc19d160aa8cb0
Reviewed-on: https://review.whamcloud.com/41433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoNew RC 2.14.0-RC3 2.14.0-RC3 v2_14_0-RC3
Oleg Drokin [Sat, 13 Feb 2021 00:52:28 +0000 (19:52 -0500)]
New RC 2.14.0-RC3

Change-Id: I594b5c6d0da7f067bef69fa7a7027374d4434dd8
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-14424 Revert "LU-9679 osc: simplify osc_extent_find()" 98/41498/2
Oleg Drokin [Fri, 12 Feb 2021 15:55:42 +0000 (10:55 -0500)]
LU-14424 Revert "LU-9679 osc: simplify osc_extent_find()"

It looks like there are performance regressions attributed to this patch.

This reverts commit 80e21cce3dd6748fd760786cafe9c26d502fd74f.

Change-Id: I55e0abd50573dd82a9d216f9c3b01483f99c3223
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41498
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
4 years agoNew release candidate 2.14.0 RC2 2.14.0-RC2 v2_14_0-RC2
Oleg Drokin [Mon, 8 Feb 2021 22:13:56 +0000 (17:13 -0500)]
New release candidate 2.14.0 RC2

Change-Id: Iad3d71e7dcf96173d192717ef4fef3f0dc12b051
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13751 tests: remove read of changelog sanity 160j 17/41317/7
James Nunez [Tue, 26 Jan 2021 01:15:49 +0000 (18:15 -0700)]
LU-13751 tests: remove read of changelog sanity 160j

sanity test 160j tries to read the changelog after one of two
client mounts is unmounted.  In this case, we can fail to read
the changelog and get a "Cannot send after transport endpoint
shutdown" error.

The intention of sanity test 160j is to check that
there is no LBUG due to missed obd device.  So, do not try to
read from the changelog after file system unmount.

Test-Parameters: trivial testlist=sanity env=ONLY=160j,ONLY_REPEAT=200
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1746a422b25d546b9aae38ae8438d9c08bce8827
Reviewed-on: https://review.whamcloud.com/41317
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
4 years agoLU-14355 ptlrpc: do not output error when imp_sec is freed 10/41310/2
Sebastien Buisson [Mon, 25 Jan 2021 08:24:19 +0000 (17:24 +0900)]
LU-14355 ptlrpc: do not output error when imp_sec is freed

There is a race condition on client reconnect when the import is being
destroyed.  Some outstanding client bound requests are being processed
when the imp_sec has already been freed.
Make sure the error message in import_sec_validate_get() is output
only if the import is not already in the zombie work queue.

Fixes: 135fea8fa9 ("LU-4423 obdclass: use workqueue for zombie management")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4b431128e04f11b1e3ee7de47090af87538c3558
Reviewed-on: https://review.whamcloud.com/41310
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-14299 test: sleep to enable quota acquire again 89/41389/3
Hongchao Zhang [Fri, 29 Jan 2021 20:51:43 +0000 (04:51 +0800)]
LU-14299 test: sleep to enable quota acquire again

sanity-quota test_61 fails with incorrect quota exceeded
errors because quota acquire will be disabled for 5 seconds
after edquot flag is set.  The test should introduce some
delay between the test of over quota and normal one.

Test-Parameters: trivial fstype=zfs testlist=sanity-quota env=ONLY=61,ONLY_REPEAT=20
Fixes: 530881fe4ee20 ("LU-7816 quota: add default quota setting support")
Change-Id: I8040ba960f32cf01cb7cee3a77c06ad4bd732f0e
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41389
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13449 tests: fix recovery-small test_140b check 09/39909/2
Andreas Dilger [Mon, 14 Sep 2020 23:07:17 +0000 (17:07 -0600)]
LU-13449 tests: fix recovery-small test_140b check

The recovery timer is printed in MM:SS format, but the current test
is unhappy if the SS part is printed as "08" or "09" since that is
interpreted by bash as an invalid octal number.  Also, the current
check does not handle the case if recovery is longer than a minute.

Change the code to convert MM:SS back to seconds for the comparison.
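
The conversion can be sketched as follows. The test itself is a bash script, so this Python version is only an analogy, and mmss_to_seconds() is a hypothetical name. The point is to parse both fields explicitly as decimal so that a leading zero is never taken as an octal prefix.

```python
def mmss_to_seconds(timer):
    """Convert an "MM:SS" timer string to a number of seconds."""
    minutes, seconds = timer.split(":")
    # Base 10 is forced, so "08" and "09" parse as 8 and 9.
    # (int("08", 0), which guesses the base from the prefix, raises
    # ValueError instead, similar to the octal problem seen in bash.)
    return int(minutes, 10) * 60 + int(seconds, 10)


print(mmss_to_seconds("0:08"))
print(mmss_to_seconds("1:09"))
```

Comparing the resulting seconds also covers recoveries longer than a minute, which a comparison on the SS field alone would miss.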

Test-Parameters: trivial testlist=recovery-small env=ONLY=140
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie1dc77a88bb0e8fd5025f2b5ca57d4a61d3ebbe5
Reviewed-on: https://review.whamcloud.com/39909
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-14316 llite: quiet spurious ioctl warning 27/41427/2
Andreas Dilger [Fri, 5 Feb 2021 20:13:10 +0000 (13:13 -0700)]
LU-14316 llite: quiet spurious ioctl warning

Calling "lfs setstripe" prints a spurious warning about using the old
ioctl(LL_IOC_LOV_GETSTRIPE) when that is not actually the case.

Remove the ioctl warning for now and deal with related issues later.

Fixes: 364ec95f3688 ("LU-9367 llite: restore ll_file_getstripe in ll_lov_setstripe")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I20f5a7adb60a30fce27e49827bd46229e2ce7057
Reviewed-on: https://review.whamcloud.com/41427
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>