Whamcloud - gitweb
fs/lustre-release.git
3 years ago LU-13514 tests: replace nid in conf-sanity test_32 37/40537/6
Yang Sheng [Wed, 4 Nov 2020 18:36:43 +0000 (02:36 +0800)]
LU-13514 tests: replace nid in conf-sanity test_32

test_32a needs replace_nid. Otherwise the mdc cannot
be initialized and the client mount hangs.

Test-Parameters: trivial
Test-Parameters: env=ONLY=32a,ONLY_REPEAT=20 fstype=ldiskfs testlist=conf-sanity
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I651f5728ad4ff96a309ed599490c9dd6ed9c5274
Reviewed-on: https://review.whamcloud.com/40537
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago New RC 2.12.6-RC1 2.12.6-RC1 v2_12_6-RC1
Oleg Drokin [Fri, 13 Nov 2020 23:08:12 +0000 (18:08 -0500)]
New RC 2.12.6-RC1

Change-Id: Ie881983730549b47c21668caf43f478fc92667a7
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13839 kernel: new kernel [RHEL 8.3 4.18.0-240.1.1.el8_3] 58/40558/3
Jian Yu [Sat, 7 Nov 2020 00:11:42 +0000 (16:11 -0800)]
LU-13839 kernel: new kernel [RHEL 8.3 4.18.0-240.1.1.el8_3]

This patch makes changes to support new RHEL 8.3 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.3

Change-Id: I06a46735b42ac258e576b1dd5c0beb17f4fd3e47
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40558
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14116 autoconf: check if DES3 enctype is supported 60/40560/2
Jian Yu [Fri, 6 Nov 2020 09:31:27 +0000 (01:31 -0800)]
LU-14116 autoconf: check if DES3 enctype is supported

krb5 releases 1.18 and later completely remove support for
all DES3 enctypes (des3-cbc-raw, des3-hmac-sha1, des3-cbc-sha1-kd).

This patch adds HAVE_DES3_SUPPORT to check if DES3 enctype
is supported.

Change-Id: Ibb51ec7961e8c775ea92dec6119f4de01e2d9b1d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40560
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 years ago LU-13519 osd-ldiskfs: expand inode project quota for upgrading 04/40404/10
Wang Shilong [Wed, 6 May 2020 04:45:25 +0000 (12:45 +0800)]
LU-13519 osd-ldiskfs: expand inode project quota for upgrading

When upgrading a filesystem, it is possible that the inode
is not big enough to hold the project ID field, and in that case
setting the project ID will return an EOVERFLOW error.

Since ldiskfs has the logic to expand the inode size automatically,
we could add similar logic for project quota.

Considering this a rare case, we just call
ldiskfs_mark_inode_dirty(), which will try to expand the inode,
instead of exporting more functions.

Lustre-change: https://review.whamcloud.com/38505
Lustre-commit: 57108489a3eb2ff6fc3994dbda0649ae445d6cb7

Change-Id: I941f33ce8f45d2015acc0a33c5b54cf3a771a452
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40404
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13969 tests: Updates to lustre-release yaml.sh 02/40402/2
Lee Ochoa [Mon, 26 Oct 2020 16:58:16 +0000 (10:58 -0600)]
LU-13969 tests: Updates to lustre-release yaml.sh

Updated output of release() function to standardize the node.yml
file os_distribution parameter. Changes as follows:

RHEL   - use redhat-release first and os-release as backup
         as the latter may not include the full version
         (major/minor)
CENTOS - use centos-release first and os-release as backup,
         same as RHEL
SUSE   - use os-release instead of suse-release as the latter
         is deprecated
UBUNTU - use os-release

Removed parsing system-release and *-release as neither
option correctly outputs desired info

Removed "lustre_" references in node.yml file attributes,
the default in Maloo is to look for non-lustre prefixes
first.
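
The lookup order listed above can be sketched roughly as follows. This
is a Python illustration and not the shell code from yaml.sh: the names
parse_os_release() and detect_os_distribution(), the in-memory "files"
mapping, and the "Name Version" output format are all assumptions made
for the example.

```python
import re


def parse_os_release(text):
    """Parse KEY=VALUE lines of os-release content into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and not key.startswith("#"):
            fields[key] = value.strip().strip('"')
    return fields


def detect_os_distribution(files):
    """Return a 'Name Version' string, or None if nothing is usable.

    `files` maps a release file name to its content. It is a
    hypothetical stand-in for reading the files under /etc.
    """
    # RHEL/CentOS: prefer the distro release file, because os-release
    # may not carry the full major/minor version.
    for name in ("centos-release", "redhat-release"):
        match = re.match(r"(.+?)\s+release\s+(\d+(?:\.\d+)*)",
                         files.get(name, ""))
        if match:
            return f"{match.group(1)} {match.group(2)}"

    # SUSE/Ubuntu, and the fallback for RHEL/CentOS: use os-release.
    fields = parse_os_release(files.get("os-release", ""))
    if "NAME" in fields and "VERSION_ID" in fields:
        return f"{fields['NAME']} {fields['VERSION_ID']}"
    return None
```

With only an os-release file present the function falls through to the
second branch, which mirrors the SUSE and UBUNTU cases above.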

Lustre-commit: f90199b104984da5f2157e39a286d433b725ed57
Lustre-change: https://review.whamcloud.com/39952

Change-Id: Ia011f944aae53f31fcd3a539e846ea5aba7ec7c4
Signed-off-by: Lee Ochoa <lochoa@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40402
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-13687 llite: return -ENODATA if no default layout 99/40499/2
Andreas Dilger [Sat, 27 Jun 2020 11:14:02 +0000 (05:14 -0600)]
LU-13687 llite: return -ENODATA if no default layout

Don't return -ENOENT if fetching the default layout from the root
directory fails.  Otherwise, "lfs find" will print an error message
for every directory scanned in the filesystem:

     lfs find: /myth/tmp does not exist: No such file or directory
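
Why the error code matters to a caller can be sketched as below. This is
a Python illustration under assumptions, not the llite or lfs code: the
helper get_default_layout() and the "fetch" callable are hypothetical,
and the caller-side handling shown is only one plausible reading of the
behavior described above.

```python
import errno


def get_default_layout(fetch, path):
    """Return the default layout for `path`, or None if there is none.

    `fetch` is a hypothetical callable that raises OSError on failure.
    ENODATA is read as "no default layout set", which is not an error
    for the caller. ENOENT and other errors are passed on unchanged.
    """
    try:
        return fetch(path)
    except OSError as exc:
        if exc.errno == errno.ENODATA:
            return None
        raise
```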

Lustre-change: https://review.whamcloud.com/39200
Lustre-commit: 7fb17eb7b7e6035931987ae1e9589639114d210e

Fixes: 3e8fa8a7396c ("LU-11656 llite: fetch default layout for a directory")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5e082c5d425c44ca7770d3b24cbb13bb7d2540e5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40499
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-12662 tests: Add new pjdfstest into tests 53/38653/8
Wei Liu [Tue, 20 Aug 2019 18:59:36 +0000 (11:59 -0700)]
LU-12662 tests: Add new pjdfstest into tests

Create a new POSIX test suite based on pjdfstest.

This is a back port from
Lustre-change: https://review.whamcloud.com/35841
Lustre-commit: 414e613c2da55e6b8d2b3b20cbfb340cd84c9854

Test-Parameters: trivial
Test-Parameters: fstype=ldiskfs testlist=pjdfstest
Test-Parameters: fstype=zfs testlist=pjdfstest

Change-Id: Iec37e2248ce5ccf89319aaffb3ead9b407ad1931
Signed-off-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13949 build: add autogen.sh into distribution tarball 66/40466/2
Jian Yu [Thu, 29 Oct 2020 18:07:03 +0000 (11:07 -0700)]
LU-13949 build: add autogen.sh into distribution tarball

This patch adds autogen.sh and config/lustre-version.m4 into
Lustre distribution tarball so that customers can regenerate
aclocal.m4, config.h.in, autoMakefile.in and configure in
their build environments.

Change-Id: Ic6c5430b9a8b504ebc6a7618e141f1ea23b046a2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40466
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13514 tests: remove upgrade images for conf-sanity 92/40492/2
James Nunez [Fri, 19 Jun 2020 18:01:42 +0000 (12:01 -0600)]
LU-13514 tests: remove upgrade images for conf-sanity

conf-sanity test 32a is hanging at a high rate.  We need to
explore whether the issue is that old images are having problems
upgrading to the latest version of Lustre.

Test-Parameters: trivial
Test-Parameters: env=ONLY=32a,ONLY_REPEAT=20 fstype=ldiskfs testlist=conf-sanity
Test-Parameters: env=ONLY=32 fstype=zfs testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I0ff1e9e1304192b1008551b82133d95a0010c86a
Reviewed-on: https://review.whamcloud.com/39109
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40492
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13437 llite: pass name in getattr by FID 82/40482/2
Lai Siyao [Mon, 12 Oct 2020 14:22:07 +0000 (22:22 +0800)]
LU-13437 llite: pass name in getattr by FID

Now that the parent FID is packed in the getattr_by_FID request
(see https://review.whamcloud.com/39290), llite should also pass in
the name, so that lmv can replace fid1 with the stripe FID. Otherwise
the MDS may treat sub files under a striped directory as remote objects.

Note, the name is not packed in the request, because if it were
packed, the MDS would getattr by name instead of by FID.

Lustre-change: https://review.whamcloud.com/40219
Lustre-commit: 90ebab5833007defd91e86f5878f356ae5304a1b

Fixes: 5f2c44bf6 ("LU-13437 llite: pack parent FID in getattr")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If8215667bcb10ea3c4c5cd2c9034d81fd1cda3b5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-13437 mdc: remote object support getattr from cache 51/40451/2
Lai Siyao [Sat, 10 Oct 2020 14:34:19 +0000 (22:34 +0800)]
LU-13437 mdc: remote object support getattr from cache

For historical reasons, IT_GETATTR lock revalidate matches
LOOKUP|UPDATE|PERM lock bits, because for MDS < 2.4 permission is
protected by the LOOKUP lock. However, this prevents a remote object
from matching the cached lock, because its LOOKUP and UPDATE locks
are fetched separately.
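
The matching problem can be illustrated with a small Python sketch. It
assumes that a cached lock matches only when a single lock covers every
wanted bit. The bit values, the match_cached_lock() helper and the lock
list are hypothetical, and the message above does not state which bits
the fixed code matches, so the sketch only shows the failure mode.

```python
# Illustrative inode lock bit values; treat them as assumptions.
LOOKUP = 0x01
UPDATE = 0x02
PERM = 0x10


def match_cached_lock(cached_locks, wanted_bits):
    """Return the bits of the first cached lock covering all wanted bits.

    A lock matches only if it holds every wanted bit at once. Bits that
    are spread over several locks do not add up to a match.
    """
    for lock_bits in cached_locks:
        if (lock_bits & wanted_bits) == wanted_bits:
            return lock_bits
    return None


# A remote object whose LOOKUP and UPDATE locks were fetched separately.
remote_object_locks = [LOOKUP, UPDATE]

# No single lock carries all three bits, so the combined match fails.
combined = match_cached_lock(remote_object_locks, LOOKUP | UPDATE | PERM)

# A narrower request can still be served from the cache.
update_only = match_cached_lock(remote_object_locks, UPDATE)
```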

Add sanity 803b, and rename 803 to 803a.

Lustre-change: https://review.whamcloud.com/40218
Lustre-commit: 72a1ca996e3a35ce3e4b7e517f77ff7ac83ccdd5

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3ac38fe34472736849307bb7f1eebb5de9343a5c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40451
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13692 ldlm: Ensure we reprocess the resource on ast error 12/40412/5
Oleg Drokin [Fri, 7 Aug 2020 07:38:51 +0000 (03:38 -0400)]
LU-13692 ldlm: Ensure we reprocess the resource on ast error

When we are trying to grant a lock and meet an AST error, rerunning
the policy is pointless, since it cannot grant a potentially now eligible
lock and our lock is already in all the queues. Just behave like all the
other handlers for an ERESTART return and run a full resource reprocess
instead.

Lustre-change: https://review.whamcloud.com/#/c/39598/
Lustre-commit: 24e3b5395bc61333a32b1e9725a0d7273925ef05

Change-Id: I3edb37bf084b2e26ba03cf2079d3358779c84b6e
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40412
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-11719 ldlm: Adjust search_* functions 99/40399/2
Patrick Farrell [Mon, 3 Dec 2018 16:36:08 +0000 (10:36 -0600)]
LU-11719 ldlm: Adjust search_* functions

The search_itree and search_queue functions should both
return either a pointer to a found lock or NULL.

Currently, search_itree just returns the contents of
data->lmd_lock, whether or not a lock was found.

search_queue will do the same under certain circumstances.

Zero lmd_lock in both search_* functions, and also stop
searching in search_itree once a lock is found.
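
The intended contract (return the found lock or nothing, never a stale
value) can be sketched in Python as below. This is not the ldlm code:
the "data" dict with an "lmd_lock" key and the "matches" predicate are
hypothetical stand-ins chosen for the illustration.

```python
def search_queue(queue, matches, data):
    """Return the first lock in `queue` accepted by `matches`, else None.

    `data` stands in for the match data. Its "lmd_lock" slot is cleared
    first, so a result left over from an earlier search is never
    reported as a hit.
    """
    data["lmd_lock"] = None
    for lock in queue:
        if matches(lock):
            data["lmd_lock"] = lock
            return lock  # stop searching once a lock is found
    return None
```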

cray-bug-id: LUS-6783
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ie231166756e60c228370f8f1a019ccfe14dfda6a
Reviewed-on: https://review.whamcloud.com/33754
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40399
Tested-by: jenkins <devops@whamcloud.com>
3 years ago LU-12014 llite: check correct size in ll_dom_finish_open() 01/40301/2
Mikhail Pershin [Wed, 19 Dec 2018 19:28:53 +0000 (22:28 +0300)]
LU-12014 llite: check correct size in ll_dom_finish_open()

The check in ll_dom_finish_open() for data end shouldn't
use i_size for comparison, because it may not yet be updated
with the data just returned from the server. Use the size value
in mdt_body from the reply for that check.

Lustre-change: https://review.whamcloud.com/33895
Lustre-commit: 7b9fd576f7de7d4bfa40c85d06bb224e7a29c829

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I1104fbbb0eb4633869b9bf2d1803ac3e84e3853d
Reviewed-on: https://review.whamcloud.com/40301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-12296 llite: improve ll_dom_lock_cancel 96/40296/3
Vladimir Saveliev [Wed, 5 Jun 2019 01:46:42 +0000 (04:46 +0300)]
LU-12296 llite: improve ll_dom_lock_cancel

ll_dom_lock_cancel() should zero kms attribute similar to
mdc_ldlm_blocking_ast0().

In order to avoid code duplication between mdc_ldlm_blocking_ast0()
and ll_dom_lock_cancel() - add new cl_object_operations method -
coo_object_flush() to reach mdc's blocking ast from llite level.

Tests illustrating the issue are added.

Lustre-change: https://review.whamcloud.com/34858
Lustre-commit: 707bab62f5d6c704b30e4ee9e769b5c9f026e1e7

LU-12704 lov: check all entries in lov_flush_composite

Check all layout entries for DOM layout and exit with
-ENODATA if none exists. The caller considers that a valid
case due to layout change.

Define llo_flush methods for all layouts as required
by lov_dispatch().

Patch also cleans up the cl_dom_size field in cl_layout, which
was used in the previous ll_dom_lock_cancel() implementation.

Run lov_flush_composite under down_read lov->lo_type_guard to avoid
race with layout change.

Lustre-change: https://review.whamcloud.com/36368
Lustre-commit: 44460570fd21a91002190c8a0620923125135b52

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2b100ead6d420dbf561bc61be973d64dad317214
Reviewed-on: https://review.whamcloud.com/40296
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14069 ldlm: Fix unbounded OBD_FAIL_LDLM_CANCEL_BL_CB_RACE wait 11/40411/3
Oleg Drokin [Fri, 23 Oct 2020 06:56:04 +0000 (02:56 -0400)]
LU-14069 ldlm: Fix unbounded OBD_FAIL_LDLM_CANCEL_BL_CB_RACE wait

In ldlm_handle_cp_callback() the while loop is clearly supposed
to be limited by the "to" value of 1 second, but is not.
This seems to have been broken by all the Solaris porting in HEAD
all the way back in 2008.
Restore the "to" assignment to make it not hang indefinitely.
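
The general pattern of a wait loop that is really bounded by its timeout
can be sketched in Python as follows. This is an illustration only and
not the kernel code: wait_until() and its parameters are hypothetical,
and the clock and sleep functions are injectable so the behavior can be
checked without real delays.

```python
import time


def wait_until(condition, timeout=1.0, poll=0.01,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it is true or `timeout` seconds elapse.

    Returns True if the condition became true and False on timeout.
    The deadline is computed once and checked on every pass, so the
    loop cannot spin forever when the condition never becomes true.
    """
    deadline = clock() + timeout
    while not condition():
        if clock() >= deadline:
            return False
        sleep(poll)
    return True
```

The point of the sketch is that the remaining time is re-evaluated on
each iteration; a loop that never updates its timeout value has no upper
bound.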

Lustre-change: https://review.whamcloud.com/#/c/40375/
Lustre-commit: 5da99051e58b9e9079b66a275d6c47e1e109eee5

Change-Id: I449bfd7f585ab7db475fb3fd4cbbd876126ff789
Fixes: adde80ffef ("Land b_head_libcfs onto HEAD")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40411
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-13719 lov: doesn't check lov_refcount 52/40452/2
Hongchao Zhang [Fri, 21 Aug 2020 10:17:12 +0000 (18:17 +0800)]
LU-13719 lov: doesn't check lov_refcount

In lov_cleanup, the check of each OSC is protected by
lov_tgt_getrefs, which will increment the "lov_refcount",
so the "lov_refcount" shouldn't be checked inside because
it is always larger than 0.

Change-Id: I21423d4345190b3e02eb00734c127e35cbc9b1af
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39702
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40452

3 years ago LU-13636 osd: create agent inode with explicit owner 03/40403/2
Alex Zhuravlev [Fri, 5 Jun 2020 05:16:32 +0000 (08:16 +0300)]
LU-13636 osd: create agent inode with explicit owner

Create the agent inode with an explicit owner to avoid quota
misaccounting.

Lustre-change: https://review.whamcloud.com/38842
Lustre-commit: 7805b45f1182ed21198c0cd2000ffe93b7de5340

Test-Parameters: fstype=ldiskfs
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5a02e6e7de71821a10704ac3516ee087998c9c21
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40403
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-13919 kernel: kernel update RHEL7.8 [3.10.0-1127.19.1.el7] 93/39993/5
Jian Yu [Mon, 26 Oct 2020 18:22:54 +0000 (11:22 -0700)]
LU-13919 kernel: kernel update RHEL7.8 [3.10.0-1127.19.1.el7]

Update RHEL7.8 kernel to 3.10.0-1127.19.1.el7.

Test-Parameters: trivial clientdistro=el7.8 serverdistro=el7.8

Change-Id: I7d0cbdb32b33f2f8121fec707924c35fa086f965
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13477 lnet: Force full discovery cycle 77/39577/9
Amir Shehata [Wed, 5 Aug 2020 19:34:10 +0000 (12:34 -0700)]
LU-13477 lnet: Force full discovery cycle

There are scenarios where there could be a discrepancy between
cached peer information and reality. In these cases, incomplete
interface information might end up being cached because one side
determined that the peer didn't require a PUSH. This will lead to
undesired MR behavior, where not all the interfaces are used for
a period of time.

Therefore, it is safer to always force a full discovery cycle:
GET/PUSH to ensure both sides are up-to-date.

In the NMR case, when discovery is turned off, make sure to flag
discovery as complete to avoid stalling the state machine.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie49ad11e8ff874206baa268a4ef2d58ebb536ed5
Lustre-change: https://review.whamcloud.com/38322
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39577
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-10756 ptlrpc: fix IMP_CLOSED state is being never set 21/38621/5
Mikhail Pershin [Mon, 3 Feb 2020 09:03:59 +0000 (12:03 +0300)]
LU-10756 ptlrpc: fix IMP_CLOSED state is being never set

Commit cf78502e48d checks the new state for the IMP_CLOSED value
instead of the import's current state, so instead of keeping the
import closed it prevents the import state from being set to
IMP_CLOSED.

This patch restores the original check to keep the import closed
by checking its current state.
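
The difference between the two checks can be shown with a short Python
sketch. It is an illustration under assumptions, not the ptlrpc code:
the dict-based import, the state strings and both function names are
hypothetical.

```python
IMP_NEW = "NEW"
IMP_CONNECTING = "CONNECTING"
IMP_CLOSED = "CLOSED"


def set_state_buggy(imp, new_state):
    """Regressed check: tests the NEW state, so CLOSED can never be set."""
    if new_state != IMP_CLOSED:
        imp["state"] = new_state


def set_state_fixed(imp, new_state):
    """Restored check: tests the CURRENT state, so a closed import stays closed."""
    if imp["state"] != IMP_CLOSED:
        imp["state"] = new_state


buggy = {"state": IMP_NEW}
set_state_buggy(buggy, IMP_CLOSED)      # silently ignored

fixed = {"state": IMP_NEW}
set_state_fixed(fixed, IMP_CLOSED)      # import becomes closed
set_state_fixed(fixed, IMP_CONNECTING)  # ignored, import remains closed
```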

Fixes: cf78502e48d ("LU-10756 ptlrpc: change IMPORT_SET_* macros into real functions")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7df2798f09ce7023381c03957adf530da4149c2d
Reviewed-on: https://review.whamcloud.com/37405
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
(cherry picked from commit 43dddbd0785d4da14714390d802bf6ec65567350)
Reviewed-on: https://review.whamcloud.com/38621

3 years ago LU-13464 target: abort recovery if timer fail 03/40303/2
Hongchao Zhang [Mon, 19 Oct 2020 18:52:56 +0000 (11:52 -0700)]
LU-13464 target: abort recovery if timer fail

During target recovery, the recovery timer should be kept armed
to ensure the recovery doesn't take too long. If the deadline of
the recovery timer has passed and the recovery is not completed
yet, there is some problem, and the recovery should be aborted
in this case.

Lustre-commit: 87443d9c27e8535c3e17d6bf142ad68d4449b93f
Lustre-change: https://review.whamcloud.com/38277

Change-Id: Id44f2a2d1a3183ad8dd13f4d34392713c55a2cb3
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40303
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14012 lod: properly initialize lcm in lod_layout_convert() 06/40306/2
John L. Hammond [Tue, 20 Oct 2020 00:40:18 +0000 (17:40 -0700)]
LU-14012 lod: properly initialize lcm in lod_layout_convert()

In lod_layout_convert() zero out lcm and lcme before constructing the
converted layout.

Lustre-commit: 6f2a1c911f0a326765e6d11f35bb602daf057948
Lustre-change: https://review.whamcloud.com/40153

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I40f96d51cb63816a9bfc34217f02ff7c450de974
Reviewed-on: https://review.whamcloud.com/40306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13511 obdclass: don't initialize obj for zero FID 04/40304/2
Lai Siyao [Mon, 19 Oct 2020 19:09:45 +0000 (12:09 -0700)]
LU-13511 obdclass: don't initialize obj for zero FID

An object with zero FID is used in stripe allocation, and it's
meaningless to initialize such an object via lu_object_find_at().
Return an error early to avoid an assertion in lu_object_put().

Lustre-commit: 22ea9767956c89aa08ef6d80ad04aaccde647755
Lustre-change: https://review.whamcloud.com/39792

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia1bda3d01ff7552e94f31a9c928868652937d559
Reviewed-on: https://review.whamcloud.com/40304
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-12233 lnet: deadlock on LNet shutdown 71/40171/2
Serguei Smirnov [Wed, 7 Oct 2020 22:13:31 +0000 (18:13 -0400)]
LU-12233 lnet: deadlock on LNet shutdown

Release ln_api_mutex during LNet shutdown while waiting
for zombie LNI to allow other threads to read the LNet
state updated by the shutdown and fall through, avoiding
the deadlock.
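
The idea of dropping a mutex while waiting, so that other threads can
make progress, can be sketched in Python as below. This is a simplified
illustration and not the LNet code: wait_for_zombies() and both callback
parameters are hypothetical, and the caller is assumed to hold the mutex
on entry.

```python
import threading


def wait_for_zombies(api_mutex, zombies_remaining, wait_once):
    """Wait until no zombies remain. Call with `api_mutex` held.

    The mutex is released around each wait so that other threads can
    take it, observe the shutdown state and get out of the way. It is
    re-acquired before the condition is checked again, and it is held
    again when the function returns.
    """
    while zombies_remaining():
        api_mutex.release()
        try:
            wait_once()
        finally:
            api_mutex.acquire()
```

Holding the mutex across the wait is what makes a deadlock possible: a
thread that must take the mutex before it can let go of its reference
would never get the chance.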

Lustre-change: https://review.whamcloud.com/39933
Lustre-commit: e0c445648a38fb72cc426ac0c16c33f5183cda08

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: If0886f1bc4412dd9cacb08a0f06fa69aeeed1c5b
Reviewed-on: https://review.whamcloud.com/40171
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13892 lnet: lock-up during router check 72/40172/2
Serguei Smirnov [Wed, 7 Oct 2020 22:51:06 +0000 (18:51 -0400)]
LU-13892 lnet: lock-up during router check

This is a fix for the issue with LNet lock-up while waiting
for routers to become active with the check_routers_before_use
option. Release ln_api_mutex while waiting to allow
incoming connections to be handled.

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I63b1d1ce5ee2b27a3bd2cea78713fc6fc7502cf7
Reviewed-on: https://review.whamcloud.com/40172
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10949 mdt: lost reference on mdt_md_root 76/39976/5
Andriy Skulysh [Wed, 20 Feb 2019 10:48:03 +0000 (12:48 +0200)]
LU-10949 mdt: lost reference on mdt_md_root

mdt_remote_object_lock_try() drops the object
reference in case of an error, but if the
request was sent to a server it is decreased
again via failed_lock_cleanup().

Add ldlm_created_callback. It is called after
lock creation, so we can safely add a reference
to l_ast_data and drop it only in BL AST handler.

Lustre-commit: b2368774a01eb89981e2ceb92be9673e4b403d62
Lustre-change: https://review.whamcloud.com/34181

Cray-bug-id: LUS-7013
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I49c946278f379390634642370d15c7fe89441d86
Reviewed-on: https://review.whamcloud.com/39976
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-11276 ldlm: fix lock convert races 54/39854/2
Vitaly Fertman [Wed, 16 Oct 2019 16:07:56 +0000 (19:07 +0300)]
LU-11276 ldlm: fix lock convert races

The blocking cb may be triggered in parallel, and the convert logic
of the DOM lock must be ready for the cancel_bits to have already
been zeroed by the first executor.

As there may be several blocking cb parallel executors and several
conversion callers, each requesting different inode bits, set up
the following logic:
- the lock keeps the aggregated set of bits requested for cancelling
  by different parties, where 0 means the whole lock is to be
  cancelled, and where the CBPENDING flag means there is a canceling
  job pending;
- once completed, the cancel_bits are zeroed and the CBPENDING flag
  is dropped, meaning the next request will be a part of the next job;
- once a local lock is converted, its state is changed appropriately
  and no cleanup is left for the interpret time as the lock is ready
  for the next usage;
- as the lock is unlocked in the process of conversion and more bits
  may appear, check it and repeat appropriately;
- let just 1 conversion executor work at a time; others wait,
  similar to ldlm_cli_cancel();
- there are others who may want to cancel unused locks (cancel_lru,
  cancel_resource_local), consider CANCELING as a request to cancel
  the full lock independently of the cancel_bits;

Some cleanups are done:
- move the cache drop logic to the CANCELING part of the blocking cb
  from the BLOCKING one;
- remove the convert RPC interpret, as the lock cleanups are already
  done in advance; the convert RPC is re-sendable and an error means
  there is a serious net problem;
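
The bookkeeping described in the first two bullets of the logic above
can be sketched in Python as follows. This is only one possible reading
and not the ldlm code: the ConvertState class, its method names and the
exact merge rule (a request for 0 turns the pending job into a full
cancel) are assumptions made for the illustration.

```python
class ConvertState:
    """Aggregated cancel request for one lock (hypothetical model)."""

    def __init__(self):
        self.cancel_bits = 0    # bits requested for cancelling
        self.cbpending = False  # True while a cancelling job is pending

    def request_cancel(self, bits):
        """Merge a request. bits == 0 asks to cancel the whole lock."""
        if not self.cbpending:
            # First request of a new job.
            self.cancel_bits = bits
            self.cbpending = True
        elif bits == 0 or self.cancel_bits == 0:
            # A full cancel swallows any partial request.
            self.cancel_bits = 0
        else:
            self.cancel_bits |= bits

    def complete(self):
        """Finish the job; later requests start the next job."""
        self.cancel_bits = 0
        self.cbpending = False
```

In this model the pair (cancel_bits == 0, cbpending set) stands for "the
whole lock is to be cancelled", while (0, not set) means nothing is
pending.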

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I901de34241704ed801152f071cb7f610fe6f4bfe
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39854
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13590 kernel: RHEL 7.9 server support 24/40224/2
Jian Yu [Mon, 12 Oct 2020 23:58:27 +0000 (16:58 -0700)]
LU-13590 kernel: RHEL 7.9 server support

This patch makes changes to support new RHEL 7.9 release
for Lustre server (kernel 3.10.0-1160.2.1.el7).

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I7653091f2bd6a579447edb12045984d2829a8235
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40224
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13922 osd-ldiskfs: no need to add OI cache in readdir 35/40135/3
Lai Siyao [Sat, 29 Aug 2020 21:53:18 +0000 (05:53 +0800)]
LU-13922 osd-ldiskfs: no need to add OI cache in readdir

It's a waste of time to call osd_add_oi_cache() in osd_it_ea_rec(),
because each dirent read will override it.

Lustre-change: https://review.whamcloud.com/39782
Lustre-commit: bc5934632df10aaa02b32b8254a473c14c6f8104

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iec701bf66153fdf2ba7a3f3b89565381215abf33
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-12870 build: sanity-hsm test depends on libtool 22/38822/4
Minh Diep [Thu, 17 Oct 2019 14:11:09 +0000 (07:11 -0700)]
LU-12870 build: sanity-hsm test depends on libtool

Adding Ubuntu libtool-bin requirement

Lustre-change: https://review.whamcloud.com/36471
Lustre-commit: dbce727a3633ce03d24c28defce9a0ed6d1ef106

Test-Parameters: trivial clientdistro=ubuntu1804 testlist=sanity-hsm

Change-Id: I04cfffc880259e4cf1c2cba142eddd47a95a736e
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38822
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 years ago LU-12352 libcfs: crashes with certain cpu part numbers 94/37994/4
Andrew Perepechko [Thu, 17 Jan 2019 21:58:10 +0000 (00:58 +0300)]
LU-12352 libcfs: crashes with certain cpu part numbers

Due to a bug in the code, libcfs will crash if the
number of online cpus is not evenly divisible by the
number of cpu partitions.

Based on the checks in cfs_cpt_table_create(), it
appears that the original intent was to push the
remaining cpus into the initial partitions.

So let's do that properly.
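
The intended distribution can be sketched in Python as below. This is an
arithmetic illustration only, not the cfs_cpt_table_create() code, and
the function name cpus_per_partition() is hypothetical.

```python
def cpus_per_partition(ncpu, npart):
    """Split `ncpu` CPUs over `npart` partitions.

    Each partition gets ncpu // npart CPUs, and the remaining
    ncpu % npart CPUs go one each to the initial partitions.
    """
    if npart <= 0 or ncpu < npart:
        raise ValueError("need at least one CPU per partition")
    base, remainder = divmod(ncpu, npart)
    return [base + 1 if i < remainder else base for i in range(npart)]
```

For 10 CPUs and 4 partitions this gives two partitions of 3 CPUs
followed by two partitions of 2, so every CPU is assigned even though
10 is not a multiple of 4.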

Lustre-commit: e33e3da58972a811e6eafc479f95f6df2baf4b9b
Lustre-change: https://review.whamcloud.com/34991

Change-Id: I3c5e2aa1fdfca4c07e7afce143c984973373f009
Cray-bug-id: LUS-6455
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/37994
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13960 tests: correct usage of _var variable 85/39985/2
James Nunez [Sat, 12 Sep 2020 18:04:02 +0000 (12:04 -0600)]
LU-13960 tests: correct usage of _var variable

In the setmodopts() function in functions.sh, the '_var'
variable is set and used.  There is one use of the variable
'var' which should be '_var'.  Change the use of 'var' to
'_var'.

Reviewed-on: https://review.whamcloud.com/39891
(cherry picked from commit ff29ed8fe9c58bd2caa4d63bcbe7556e1c320703)

Test-Parameters: trivial
Test-Parameters: testlist=conf-sanity env=ONLY=53 clientdistro=ubuntu1804 fstype=ldiskfs
Test-Parameters: testlist=conf-sanity env=ONLY=53 clientdistro=el7.8 fstype=ldiskfs
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: If524be1f0b4b2170a514a558256a5308c9a5e586
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13590 kernel: new kernel [RHEL 7.9 3.10.0-1160.2.1.el7] 77/40177/2
Jian Yu [Thu, 8 Oct 2020 18:13:45 +0000 (11:13 -0700)]
LU-13590 kernel: new kernel [RHEL 7.9 3.10.0-1160.2.1.el7]

This patch makes changes to support new RHEL 7.9 release
for Lustre client.

Test-Parameters: trivial clientdistro=el7.9

Change-Id: I7a2846de48a6710d6d720d6ccc3176dba4afc6bb
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40177
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-12820 osc: remove 'transient' arg from osc_enter_cache_try 18/39518/7
Mr NeilBrown [Sun, 29 Sep 2019 23:09:54 +0000 (09:09 +1000)]
LU-12820 osc: remove 'transient' arg from osc_enter_cache_try

This arg is always '0', so remove it.
Consequently, OBD_BRW_NOCACHE is never set, and
cl_dirty_transit and obd_dirty_transit_pages
are never non-zero, so they can be removed as well.

Lustre-change: https://review.whamcloud.com/36319
Lustre-commit: 524deb6f985beb512a4499501fd7275ecb77f815

Patch also includes changes for atomic ops optimization
to keep in sync with master branch:

Lustre-change: https://review.whamcloud.com/33859
Lustre-commit: 8b364fbd6bd9e0088440e6d6837861a641b923a0

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia047affc33fb9277e6c28a8f6d7d088c385b51a8
Reviewed-on: https://review.whamcloud.com/39518
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13608 tgt: abort recovery while reading update llog 84/39284/6
Hongchao Zhang [Tue, 30 Jun 2020 11:22:10 +0000 (19:22 +0800)]
LU-13608 tgt: abort recovery while reading update llog

Abort reading the update LLOG from other MDTs when the recovery
is aborted, so that the recovery process can be aborted in time.

This patch also adds a watchdog for the processing of replay requests
to detect a possible stale process.

Lustre-change: https://review.whamcloud.com/38746
Lustre-commit: 0496cdf20451f07befebd1cb8a770544ec0f57df

Change-Id: Ie2de041360c9eba95ef9bfd14b00ac2709e6eace
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38746
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39284
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13437 llite: pack parent FID in getattr 71/39771/2
Lai Siyao [Mon, 6 Jul 2020 13:52:45 +0000 (21:52 +0800)]
LU-13437 llite: pack parent FID in getattr

Pack parent FID in getattr request if OBD_CONNECT2_GETATTR_PFID is
enabled, otherwise fill it with target FID for backward compatibility.

Lustre-change: https://review.whamcloud.com/39290
Lustre-commit: 5f2c44bf626b178503c1c4d2d85c40bae087ff4f

Fixes: f9a2da63 ("LU-13437 mdt: don't fetch LOOKUP lock for remot...")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I91bace23e67b548feb92fd885fb5e64e92c96408
Reviewed-on: https://review.whamcloud.com/39771
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13437 uapi: add OBD_CONNECT2_GETATTR_PFID 70/39770/2
Lai Siyao [Mon, 6 Jul 2020 13:03:59 +0000 (21:03 +0800)]
LU-13437 uapi: add OBD_CONNECT2_GETATTR_PFID

Add OBD_CONNECT2_GETATTR_PFID connect flag to pack parent FID in
getattr request, which will be used to check whether target is
remote object, if so, don't take LOOKUP lock, otherwise client
may see stale directory entries.

Lustre-change: https://review.whamcloud.com/39289
Lustre-commit: f384a8733c41e43ebc2db3c542287a700ace8cbb
Test-parameters: trivial

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Ibdf880934456f255f83cd4bac9d61ab5e1ed7330
Reviewed-on: https://review.whamcloud.com/39770
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13437 mdt: rename misses remote LOOKUP lock revoke 01/39601/3
Lai Siyao [Wed, 8 Apr 2020 14:55:22 +0000 (22:55 +0800)]
LU-13437 mdt: rename misses remote LOOKUP lock revoke

In rename, all objects but the target may be remote, so to check whether
the source is a remote object on the source parent, we need to compare
which MDTs they are located on if both are remote. Add a helper function
mdt_rename_source_lock() to handle all possible combinations. If the
target parent is remote, take a remote LOOKUP lock for the target on the
MDT where the target parent is.

Add sanityn.sh 81c.

Lustre-change: https://review.whamcloud.com/38181
Lustre-commit: 4918fe40db262b19093436caca688c75eb632496

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I2c134970d6abc8761528d01950b23495292cdf93
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13437 mdt: don't fetch LOOKUP lock for remote object 69/39769/2
Lai Siyao [Sun, 10 May 2020 07:22:36 +0000 (15:22 +0800)]
LU-13437 mdt: don't fetch LOOKUP lock for remote object

Pack parent FID in getattr by FID, which will be used to check whether
child is remote object on parent. The helper function is called
mdt_is_remote_object(). NB, directory shard is not treated as remote
object, because if so, client needs to revalidate shards when dir is
accessed, which will hurt performance much.

For getattr by FID, if object is remote file on parent, don't fetch
LOOKUP lock, otherwise client may see stale dir entries.

Lustre-change: https://review.whamcloud.com/38561
Lustre-commit: f9a2da63abab5b8b687842166a0b5b5e434ad441

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I37b36983735eca63da37f190456b5cc1b861b29e
Reviewed-on: https://review.whamcloud.com/39769
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13437 lmv: check stripe FID sanity 00/39600/2
Lai Siyao [Fri, 8 May 2020 14:53:47 +0000 (22:53 +0800)]
LU-13437 lmv: check stripe FID sanity

A striped directory layout may be broken; if any stripe FID is insane,
return -ENODEV.

Lustre-change: https://review.whamcloud.com/38560
Lustre-commit: 698a496aac51e11791717a9cbd0a86b3525f4557

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7ed8c7c561e34625e2cb29bfd14bc0ecf3fce46c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39600
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13471 lnet: use the same src nid for discovery 76/39576/4
Amir Shehata [Thu, 23 Apr 2020 00:06:23 +0000 (17:06 -0700)]
LU-13471 lnet: use the same src nid for discovery

When discovering a remote peer (not on the same network) a GET is
sent to the peer to retrieve the peer's interfaces.  This is followed
by a PUSH, if discovery is on, to push the node's interfaces. However,
if both node and peer have multiple interfaces it is likely that the
GET and the PUSH will originate on different interfaces. When the
peer receives the PUSH it will not be able to connect the two NIDs
and will not be able to consolidate the node's NIDs.  This issue is
specific for remote peers because at the time the push handler is
invoked the remote lpni has not been created yet. lnet_parse()
creates the lpni of the gateway.

Similar to the strategy already in place of using the same source NID
for all the messages of an RPC, discovery should use the same source
NID for both the GET and PUSH.

This patch stores the source NID interfaces the GET was sent on and
uses it for the PUSH.
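
The idea in the message above can be modelled in a few lines. This is a toy Python sketch, not LNet code: the class, the `disc_src_nid` field and the NID strings are all made up for illustration. It only shows the source NID chosen for the GET being remembered and reused for the PUSH.

```python
class DiscoverySketch:
    """Toy model: remember the GET's source NID and reuse it for the PUSH."""

    def __init__(self, local_nids):
        self.local_nids = list(local_nids)
        self.disc_src_nid = None  # hypothetical field name
        self.sent = []            # (message type, source NID, peer NID)

    def send_get(self, peer_nid, src_nid=None):
        # Fall back to the first local interface if none was pinned.
        src = src_nid if src_nid is not None else self.local_nids[0]
        self.disc_src_nid = src
        self.sent.append(("GET", src, peer_nid))

    def send_push(self, peer_nid):
        # Reuse the GET's source NID so the peer can tie both messages together.
        src = self.disc_src_nid if self.disc_src_nid is not None else self.local_nids[0]
        self.sent.append(("PUSH", src, peer_nid))
```

With two local interfaces, a GET sent from the second one is followed by a PUSH from that same interface rather than from whichever interface would otherwise be picked.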

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I5a13ab7799b2ddc47714202bcbed786b0d3940b7
Reviewed-on: https://review.whamcloud.com/38320
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39576

3 years agoLU-13907 llite: don't set FS_REQUIRES_DEV on client 74/39674/4
Andreas Dilger [Thu, 13 Aug 2020 22:18:52 +0000 (16:18 -0600)]
LU-13907 llite: don't set FS_REQUIRES_DEV on client

If doing a client-only build, do not set the FS_REQUIRES_DEV flag
for the 'lustre' filesystem type.  This is only needed on the server,
but the filesystem type declaration is shared between both.

In master, this was fixed by declaring a new 'lustre_tgt' filesystem
type and using that for server filesystem mounts.  However, for 2.12
this is overkill, and it is possible to get a 95% fix by dropping
the FS_REQUIRES_DEV flag for the common case of client-only builds.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: Iab2e78515aba018e2a6bceb324ad1b8a313ebbe5
Reviewed-on: https://review.whamcloud.com/39674
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13187 osd-ldiskfs: don't enforce max dir size limit on IAM objects 82/39882/3
Li Dongyang [Thu, 3 Sep 2020 23:34:34 +0000 (09:34 +1000)]
LU-13187 osd-ldiskfs: don't enforce max dir size limit on IAM objects

Add ext4-no-max-dir-size-limit-for-iam-objects.patch to introduce new
inode state EXT4_STATE_IAM and use it to mark IAM objects.

Lustre-change: https://review.whamcloud.com/39823
Lustre-commit: 03e6db505be90d35ccacb3af7e15277784e5d448

Change-Id: I3bcc5435ea07edb9fa265dcd8e3261d849495f00
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/39882
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13763 osc: don't allow negative grants 80/39380/4
Mikhail Pershin [Wed, 15 Jul 2020 05:42:49 +0000 (08:42 +0300)]
LU-13763 osc: don't allow negative grants

Add a check in osc_init_grant() to prevent possible
underflow of cl_avail_grant and report an error if it happens.
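
A minimal Python sketch of this kind of guard is shown below. It is not the osc code: the function and parameter names are illustrative, and clamping to zero on underflow is an assumption, since the message does not say what value the real check falls back to.

```python
def init_avail_grant(granted, dirty):
    """Return (avail_grant, error_message_or_None).

    Toy model of an underflow guard; names are illustrative only.
    """
    avail = granted - dirty
    if avail < 0:
        # Assumed fallback: report the problem and clamp to zero instead
        # of carrying a negative value forward.
        return 0, "avail_grant underflow: granted=%d dirty=%d" % (granted, dirty)
    return avail, None
```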

Lustre-change: https://review.whamcloud.com/#/c/39827
Lustre-commit: e05ccafd6ee214895d01efbb13a3757e3625a859

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Idcd25ed427c23735e1cdc70359bace43b5b9d886
Reviewed-on: https://review.whamcloud.com/39380
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12687 osc: consume grants for direct I/O 86/39386/11
Vladimir Saveliev [Mon, 29 Jun 2020 11:26:57 +0000 (14:26 +0300)]
LU-12687 osc: consume grants for direct I/O

The new IO engine implementation stopped consuming grants for direct I/O
writes. That led to a premature out-of-space condition during
direct I/O. The following illustrates the problem:
  # OSTSIZE=100000 sh llmount.sh
  # dd if=/dev/zero of=/mnt/lustre/file bs=4k count=100 oflag=direct
  dd: error writing ‘/mnt/lustre/file’: No space left on device

Consume grants for direct I/O.

Try to consume grants in osc_queue_sync_pages() when it is called for
pages which are being written in direct i/o.

Tests are added to verify grant consumption in buffered and direct i/o
and to verify direct i/o overwrite when ost is full.
The overwrite test is for ldiskfs only as zfs is unable to overwrite
when it is full.

Lustre-change: https://review.whamcloud.com/35896
Lustre-commit: 05f326a7988a7a0d6954d1b0d318315526209ae6

Fixes: 9fe4b52ad2 ("LU-1030 osc: new IO engine implementation")
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I9a199452c564e8e8ad02f79231e8481166f3666e
Cray-bug-id: LUS-7036
Reviewed-on: https://review.whamcloud.com/39386
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13761 o2ib: Fix compilation with MOFED 5.1 81/39781/2
Sergey Gorenko [Tue, 1 Sep 2020 06:53:06 +0000 (23:53 -0700)]
LU-13761 o2ib: Fix compilation with MOFED 5.1

A new argument was added to rdma_reject() in MOFED 5.1 and
Linux 5.8.

Add a configure check and support both versions of rdma_reject().

Lustre-commit: 956deb0fe8195c7a0c38c66a5a8cc1e95c2c245e
Lustre-change: https://review.whamcloud.com/39323

Test-Parameters: trivial
Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com>
Change-Id: I2b28991f335658b651b21a09899b7b17ab2a9d57
Reviewed-on: https://review.whamcloud.com/39781
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13742 llite: do not bypass selinux xattr handling 71/39671/3
Shaun Tancheff [Wed, 5 Aug 2020 14:17:03 +0000 (09:17 -0500)]
LU-13742 llite: do not bypass selinux xattr handling

Without the hint from selinux_is_enabled() to determine if selinux
is running at boot the performance fix from LU-549 to skip handling
of selinux xattrs cannot be correctly handled.

The correct path is to act as if selinux is enabled.

This fixes a bug introduced by LU-12355 that now exists in
RHEL 8.2 kernels where clients have enabled selinux.

Lustre-change: https://review.whamcloud.com/39569
Lustre-commit: 994287bd47819ebd8badb716da4232cdff97d324

Fixes: 39e5bfa734 ("LU-12355 llite: include file linux/selinux.h removed")
Test-Parameters: clientdistro=el8.2 serverdistro=el8.2 clientselinux testlist=sanity-selinux
Test-Parameters: clientdistro=el8.1 serverdistro=el8.1 clientselinux testlist=sanity-selinux
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6fb5ed9ecdb79545225b5586b90509eb157a355b
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39671
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13580 tests: fix retrieval of SELinux context 13/39713/2
Sebastien Buisson [Mon, 18 May 2020 09:43:22 +0000 (11:43 +0200)]
LU-13580 tests: fix retrieval of SELinux context

Use 'stat' command instead of 'ls -lZ' to retrieve SELinux security
context, to make it more portable.

Lustre-change: https://review.whamcloud.com/38648
Lustre-commit: ca09fda138b6d72588f40e4cf79c5f2de832d2dd

Test-Parameters: trivial clientselinux testlist=sanity-selinux mdtcount=2 clientcount=2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I61bc0efb1e8ae0427d05827e2933eb0b848fb442
Reviewed-on: https://review.whamcloud.com/39713
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13278 lnet: Reconcile discovery push and reply handling 75/39575/2
Chris Horn [Mon, 10 Feb 2020 20:11:49 +0000 (14:11 -0600)]
LU-13278 lnet: Reconcile discovery push and reply handling

Reconcile the logic for updating the multi-rail flag of a peer when
processing a discovery PUSH with the logic used when processing a
discovery REPLY.

Cray-bug-id: LUS-8516
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Idfb4c3729822d03b71f9440ac66176ae6b886022
Reviewed-on: https://review.whamcloud.com/37674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Stephen Champion <stephen.champion@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39575
Reviewed-by: Chris Horn <chris.horn@hpe.com>
3 years agoLU-13818 build: use libsnmp-dev instead of libsnmp30 79/39679/2
Minh Diep [Fri, 24 Jul 2020 17:38:04 +0000 (10:38 -0700)]
LU-13818 build: use libsnmp-dev instead of libsnmp30

Installing libsnmp-dev will pull in the correct libsnmpXX.
By depending on the libsnmp-dev we can install on
ubuntu 20.04 which is libsnmp35

Lustre-change: https://review.whamcloud.com/39506
Lustre-commit: af2f77633bf7b12d6ca1ab606ff90cf1ee58107a

Change-Id: Ib921ac35e06149ba88fa8e39b9a0980deb94acf2
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39679
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13599 mdt: fix mti_big_lmm buffer usage 21/39521/2
Mikhail Pershin [Tue, 28 Jul 2020 11:33:18 +0000 (14:33 +0300)]
LU-13599 mdt: fix mti_big_lmm buffer usage

The mti_big_lmm buffer can be used just as a temporary buffer
in some cases. The mti_big_lmm_used flag should be dropped after
that to avoid an assertion in mdt_big_attr_get().

This fix is extracted from bigger patch of LU-11025 in
master branch.

Lustre-change: https://review.whamcloud.com/37284
Lustre-commit: a336d7c7c1cd62a5a5213835aa85b8eaa87b076a

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I3718d6c413ef1d5f8242e548868602ef6476006e
Reviewed-on: https://review.whamcloud.com/39521
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9971 lnet: use after free in lnet_discover_peer_locked() 91/38891/6
Olaf Weber [Tue, 12 Sep 2017 12:07:50 +0000 (14:07 +0200)]
LU-9971 lnet: use after free in lnet_discover_peer_locked()

When the lnet_net_lock is unlocked, the peer attached to an
lnet_peer_ni (found via lnet_peer_ni::lpni_peer_net->lpn_peer)
can change, and the old peer deallocated. If we are really
unlucky, then all the churn could give us a new, different,
peer at the same address in memory.

Change the reference counting on the lnet_peer lp so that it
is guaranteed to be alive when we relock the lnet_net_lock for
the cpt. When the reference count is dropped lp may go away if
it was unlinked, but the new peer is guaranteed to have a
different address, so we can still correctly determine whether
the peer changed and discovery should be redone.

LU-9971 lnet: fix peer ref counting

Exit from the loop after peer ref count has been incremented
to avoid wrong ref count.

The code makes sure that a peer is queued for discovery at most
once if discovery is disabled. This is done to use discovery
as a standard ping for gateways which do not have discovery feature
or discovery is disabled.

Signed-off-by: Olaf Weber <olaf.weber@hpe.com>
Change-Id: Ia44dce20074b27ec0e77d7c1908c6a44ec73d326
Reviewed-on: https://review.whamcloud.com/28944
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38891
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 years agoLU-13609 llog: list all the log files correctly on MGS/MDT 30/39330/4
Emoly Liu [Fri, 10 Jul 2020 05:05:00 +0000 (13:05 +0800)]
LU-13609 llog: list all the log files correctly on MGS/MDT

"lctl --device xxx llog_catlist" should list all the config log on
MGS and catalog on MDT correctly without any buffer size limit.
If the data can't be fetched in one call, data->ioc_count is used to
save the number of logs fetched so far, and the fetch then continues.
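
The chunked fetch described above can be sketched as a simple loop. This is a Python model under stated assumptions, not the ioctl path: `fetch_chunk` is a hypothetical callable standing in for one request, and the local `count` plays the role the message gives to data->ioc_count.

```python
def fetch_all_logs(fetch_chunk, chunk_size):
    """Collect every log name by repeated chunked requests.

    `fetch_chunk(start, chunk_size)` is a hypothetical callable returning
    at most chunk_size names beginning at index `start`.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    names = []
    count = 0  # number of logs fetched so far
    while True:
        chunk = fetch_chunk(count, chunk_size)
        names.extend(chunk)
        count += len(chunk)
        if len(chunk) < chunk_size:
            # A short (or empty) chunk means nothing is left to fetch.
            return names
```

Because each request resumes from the running count, the total number of logs is not bounded by the size of a single reply buffer.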

conf-sanity.sh test_123af is added to verify this patch. And the
minor style issue in LU-13757 is fixed as well.

Lustre-change: https://review.whamcloud.com/38917
Lustre-commit: 1d97a8b4cd3de9074f323332c7b736367a70d419

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I364d563446833751b1f017fa2bef0351dab56235
Reviewed-on: https://review.whamcloud.com/39330
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13667 ptlrpc: fix endless loop issue 44/39344/2
Hongchao Zhang [Fri, 19 Jun 2020 02:53:12 +0000 (10:53 +0800)]
LU-13667 ptlrpc: fix endless loop issue

In ptlrpc_pinger_main, if pinging the recoverable clients or
obd_update_maxusage takes too long, the thread could be stuck
in an endless loop because of the negative value returned
by pinger_check_timeout.
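
To make the hazard concrete, here is a small Python model of a "time until next pass" calculation. It is not the ptlrpc code: the function name and the `min_wait` parameter are hypothetical, and clamping to a minimum wait is an assumed remedy, since the message does not describe how the patch handles the negative value.

```python
def time_to_next_ping(loop_start, now, interval, min_wait=1):
    """Seconds to wait before the next pinger pass.

    If the pass itself took longer than `interval`, the raw remainder is
    negative; clamp it so the caller always gets a usable positive wait.
    """
    remaining = interval - (now - loop_start)
    return max(remaining, min_wait)
```

A pass that starts at t=0 and finishes at t=25 with a 10 second interval has a raw remainder of -15; the clamp turns that into a 1 second wait instead of a value the loop cannot sleep on.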

Lustre-change: https://review.whamcloud.com/38915
Lustre-commit: 6be2dbb2595121fabceda86c5f7bdcb45e10b320

Change-Id: Ib7fc22b3cc31255223bc2be60224ced1a3585f87
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-12222 ptlrpc: Check if NID is local, not just lolnd NID 65/38865/2
Chris Horn [Mon, 27 Apr 2020 15:07:21 +0000 (10:07 -0500)]
LU-12222 ptlrpc: Check if NID is local, not just lolnd NID

There's a couple places where we check whether a NID is the lolnd NID
but we really want to know whether the NID is local. Use
LNetIsPeerLocal() to accomplish this.

Lustre-change: https://review.whamcloud.com/38388
Lustre-commit: 95bcc24642c4b95d093407fef0947ee2f5a2c01a

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia17b9b4b54fd1063c42a6f8bdd0e593be1086683
Reviewed-on: https://review.whamcloud.com/38865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12222 lnet: Primary NID of lolnd NID is the lolnd NID 64/38864/2
Chris Horn [Wed, 22 Apr 2020 16:42:27 +0000 (11:42 -0500)]
LU-12222 lnet: Primary NID of lolnd NID is the lolnd NID

We want Lustre traffic that is intended for the local peer to be sent
and received over the lolnd. The function ptlrpc_uuid_to_peer() will
currently resolve a NID to the lolnd NID, but ptlrpc_connection_get()
will overwrite this selection with the result from LNetPrimaryNID().

Have LNetPrimaryNID return the lolnd NID when it is passed the lolnd
NID.

Lustre-change: https://review.whamcloud.com/38313
Lustre-commit: 33d2e44e5026f1e9162dd5e6b931085fdc035a34

HPE-bug-id: LUS-8457
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I02708bb45f8440091782ca7886bac7656efb0223
Reviewed-on: https://review.whamcloud.com/38864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-12222 lnet: Introduce constant for the lolnd NID 63/38863/2
Chris Horn [Wed, 22 Apr 2020 16:39:46 +0000 (11:39 -0500)]
LU-12222 lnet: Introduce constant for the lolnd NID

This patch adds a new constant, LNET_NID_LO_0, to represent the lolnd
NID 0@lo.

Lustre-change: https://review.whamcloud.com/38312
Lustre-commit: 56203e4ba0a64789e42ea45946e8c51f1db351fb

HPE-bug-id: LUS-8457
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I3e57637f297b8de306905a447af8f025e31d1fcf
Reviewed-on: https://review.whamcloud.com/38863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-12758 quota: clear default flag for new ID 08/38808/2
Hongchao Zhang [Tue, 2 Jun 2020 16:20:47 +0000 (09:20 -0700)]
LU-12758 quota: clear default flag for new ID

When setting the quota limits as 0 by "lfs setquota", the default
flag won't be cleared if the lquota_entry is just created for some
quota ID at the first time because the quota limits are the same.

This patch is back-ported from the following one:
Lustre-commit: ce86e23b21ccffc395089578c0ca356de219ac88
Lustre-change: https://review.whamcloud.com/36236

Change-Id: I7f44ce0cb13783ca5bede2f55cd0707f1ccbc8ca
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38808
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13659 kernel: kernel update SLES12 SP4 [4.12.14-95.54.1] 39/39239/3
Jian Yu [Thu, 2 Jul 2020 04:14:25 +0000 (21:14 -0700)]
LU-13659 kernel: kernel update SLES12 SP4 [4.12.14-95.54.1]

Update SLES12 SP4 kernel to 4.12.14-95.54.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp4 \
envdefinitions=LNET_SELFTEST_EXCEPT=smoke,SANITY_EXCEPT="103a 817"

Change-Id: If7b9143bec6d9c526bd65e96a771c83f2530e608
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39239
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13599 mdt: fix logic of skipping local locks in reply_state 91/39191/4
Mikhail Pershin [Fri, 26 Jun 2020 15:17:06 +0000 (18:17 +0300)]
LU-13599 mdt: fix logic of skipping local locks in reply_state

mdt_reint_migrate() controls the number of local locks taken and
prevents saving too many locks in reply_state by doing a local
sync instead. However, there is a flaw in that logic, so the locks
are always saved, causing an assertion in ptlrpc_save_lock().

The patch takes the 'do_sync' local parameter into consideration when
deciding whether to save a local lock.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I98cca84825ce5789094fbceb5d1f7975410d134b
Reviewed-on: https://review.whamcloud.com/39191
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12424 lnet: prevent loop in LNetPrimaryNID() 90/38890/4
Amir Shehata [Tue, 11 Jun 2019 18:25:27 +0000 (11:25 -0700)]
LU-12424 lnet: prevent loop in LNetPrimaryNID()

If discovery is disabled locally or at the remote end, then attempt
discovery only once. Do not update the internal database when
discovery is disabled and do not repeat discovery.

This change prevents LNet from getting hung waiting for
discovery to complete.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4543b0f71e6cf297a1a5f058ebcc6bf74b8ac328
Reviewed-on: https://review.whamcloud.com/35191
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38890
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
3 years agoLU-13149 tests: change sanityn 103 facet value 47/38847/3
James Nunez [Fri, 5 Jun 2020 15:15:01 +0000 (09:15 -0600)]
LU-13149 tests: change sanityn 103 facet value

The facet name input to lustre_version_code() in sanityn
test 103 should be 'ost1' not a variable '$ost1'.  Let's
replace this call with the $OST1_VERSION variable.

Fixes: 2548cb9e32bfca ("LU-11670 osc: glimpse - search for active lock")
Test-Parameters: trivial
Test-Parameters: serverversion=2.10.8 serverdistro=el7.6 env=ONLY=103 testlist=sanityn
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ib7426f78210c9b32ba53c46ba5f08faeb3ea8ec5
Reviewed-on: https://review.whamcloud.com/38847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
3 years agoLU-11782 tests: add version check to conf-sanity 117 51/38851/2
James Nunez [Fri, 5 Jun 2020 17:21:09 +0000 (11:21 -0600)]
LU-11782 tests: add version check to conf-sanity 117

conf-sanity test 117 was added to check error returns from
read_param().  This test will fail when run with servers
with Lustre version less than 2.12.0 and, thus, should be
skipped for all Lustre servers earlier than 2.12.0.

Fixes: 6ca2425ccf6b ("LU-11198 utils: propagate errors for read_param")
Test-Parameters: trivial
Test-Parameters: serverversion=2.10.8 serverdistro=el7.6 env=ONLY=117 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ia0889584d9c1a6c09ea2a99fa11c7abfd1474de4
Reviewed-on: https://review.whamcloud.com/38851
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
3 years agoLU-13640 tests: add version check to conf-sanity 125 50/38850/2
James Nunez [Fri, 5 Jun 2020 16:55:44 +0000 (10:55 -0600)]
LU-13640 tests: add version check to conf-sanity 125

In Lustre 2.12.3, the l_tunedisk utility was modified to
skip tuning devices on the MDS and MGS and conf-sanity
test 125 was added to check this functionality.  Thus, this
test should be skipped for all Lustre server versions prior
to 2.12.3.
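
A version gate of this kind boils down to comparing packed version numbers. The Python sketch below is illustrative only and is not the test-framework shell code: the function names are hypothetical, the one-byte-per-component packing is an assumption, and only plain dotted numeric versions are handled.

```python
def version_code(version):
    """Pack 'major.minor.patch[.fix]' into one comparable integer.

    Assumed packing: one byte per component, major in the top byte.
    """
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (4 - len(parts))
    major, minor, patch, fix = parts[:4]
    return (major << 24) | (minor << 16) | (patch << 8) | fix


def should_skip(server_version, minimum="2.12.3"):
    """True when the server is older than the version that added the feature."""
    return version_code(server_version) < version_code(minimum)
```

With these definitions a 2.10.8 server is skipped, while 2.12.3 and later servers run the test.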

Fixes: bab0570ce3081 ("LU-12387 tests: Validate l_tunedisk max_sectors_kb tuning")
Test-Parameters: trivial
Test-Parameters: serverversion=2.10.8 serverdistro=el7.6 env=ONLY=125 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I89c2900c2430ff3e76bee297809957380404aa31
Reviewed-on: https://review.whamcloud.com/38850
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13088 ldlm: Fix sleeping function called in atomic 83/39283/3
Mr NeilBrown [Thu, 19 Dec 2019 05:55:35 +0000 (16:55 +1100)]
LU-13088 ldlm: Fix sleeping function called in atomic

target_recovery_overseer() can sleep while holding a spinlock, which
triggers a BUG warning.

It is easily fixed by dropping the spinlock before waiting.  In the
case where the task waits, no useful information that could be
protected by the spinlock is held, so nothing can be lost by dropping
it.
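
The locking pattern described above can be modelled in user space. This is a minimal Python sketch, not the kernel change itself: threading.Lock stands in for the spinlock, threading.Event stands in for the wait queue, and RecoveryState, its fields, and wait_for_recovery() are hypothetical names chosen for illustration.

```python
import threading


class RecoveryState:
    """Hypothetical stand-in for the state guarded by the spinlock."""

    def __init__(self):
        self.lock = threading.Lock()    # models the spinlock
        self.done = threading.Event()   # models the wait queue
        self.pending = 1                # hypothetical condition to wait on


def wait_for_recovery(state, timeout=1.0):
    """Check state under the lock, but block only after releasing it."""
    with state.lock:
        must_wait = state.pending > 0
    # The lock is no longer held here, so blocking is safe.
    if must_wait:
        return state.done.wait(timeout)
    return True
```

The decision to wait is made under the lock, while the blocking call happens only after the lock has been released.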

Lustre-change: https://review.whamcloud.com/#/c/37063/
Lustre-commit: b29b9310dafe17ba78e1db490b79b89d2d6fdcd1

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8bb3d02523b5dcfadac19f01ccb736d7b7f28239
Reviewed-on: https://review.whamcloud.com/37063
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39283

3 years agoLU-13653 mdt: ignore quota when creating slave stripe 82/39282/2
Hongchao Zhang [Wed, 24 Jun 2020 09:53:55 +0000 (17:53 +0800)]
LU-13653 mdt: ignore quota when creating slave stripe

When creating a striped directory, the quota limit has already been
checked on the master MDT, so the quota should be ignored when
creating the slave stripe object.

Lustre-change: https://review.whamcloud.com/#/c/38875/
Lustre-commit: f762acebfcc6a88c3f4ba6296cbd6f1696bff530

Change-Id: Ia53b1975a8d66c78725feb313659f7a9b889e735
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38875
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39282
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
3 years agoLU-13709 utils: 'lfs mkdir -i -1' doesn't work 65/39165/3
Lai Siyao [Wed, 24 Jun 2020 12:01:08 +0000 (20:01 +0800)]
LU-13709 utils: 'lfs mkdir -i -1' doesn't work

'lfs mkdir -i -1 -c...' creates a directory on MDTs chosen by space
usage. When the stripe count is more than 1, the target MDT list is
not correctly initialized, which causes the command to fail.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id4584940cec390a9245e888c96c7873f5afa209e
Reviewed-on: https://review.whamcloud.com/39165
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13600 ptlrpc: limit rate of lock replays 11/39111/5
Mikhail Pershin [Fri, 12 Jun 2020 14:14:50 +0000 (17:14 +0300)]
LU-13600 ptlrpc: limit rate of lock replays

Clients send all lock replays at once, which may overwhelm the
server with a huge number of replays in the recovery queue, causing
OOM effects.

The patch adds rate control for lock replays on the client.
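
One plausible way to picture client-side rate control is a cap on the number of lock replays in flight. The sketch below is an illustrative Python model only: replay_with_limit(), the max_in_flight parameter, and the event list are assumptions and do not reflect the actual ptlrpc implementation.

```python
from collections import deque


def replay_with_limit(locks, max_in_flight):
    """Simulate sending lock replays with at most max_in_flight outstanding.

    Returns the ordered list of ("send"|"reply", lock) events and the peak
    number of replays that were in flight at once.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    in_flight = deque()
    events = []
    peak = 0
    for lock in locks:
        # At the cap: wait for the oldest replay to complete first.
        if len(in_flight) >= max_in_flight:
            events.append(("reply", in_flight.popleft()))
        in_flight.append(lock)
        events.append(("send", lock))
        peak = max(peak, len(in_flight))
    # Drain the replays that are still outstanding.
    while in_flight:
        events.append(("reply", in_flight.popleft()))
    return events, peak
```

With such a cap the server-side recovery queue grows with the cap rather than with the total number of locks held by the client.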

The patch also includes a later fix for the signal_completed_replay()
race.

Lustre-change: https://review.whamcloud.com/38920
Lustre-commit: 3b613a442b8698596096b23ce82e157c158a5874

Lustre-change: https://review.whamcloud.com/39140
Lustre-commit: dc654756af63bd30802ebd86074019d1533a4d8f

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie557f8481c5facb690468d7136cf5feebe4e8f11
Reviewed-on: https://review.whamcloud.com/39111
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-13657 kernel: kernel update RHEL8.2 [4.18.0-193.6.3.el8_2] 03/38903/4
Jian Yu [Tue, 7 Jul 2020 18:13:05 +0000 (11:13 -0700)]
LU-13657 kernel: kernel update RHEL8.2 [4.18.0-193.6.3.el8_2]

Update RHEL8.2 kernel to 4.18.0-193.6.3.el8_2 for Lustre client.

Test-Parameters: trivial clientdistro=el8.2

Change-Id: Id9eb16b9277bf2157905eb38a23a3250a0033560
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38903
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13503 mdc: allow setting max_mod_rpcs_in_flight larger 93/38893/3
Andreas Dilger [Wed, 10 Jun 2020 21:34:03 +0000 (14:34 -0700)]
LU-13503 mdc: allow setting max_mod_rpcs_in_flight larger

Allow setting mdc.*.max_mod_rpcs_in_flight > mdc.*.max_rpcs_in_flight
by increasing the latter value, rather than returning an error and
telling the user to do that.  This matches the similar behavior if
mdc.*.max_rpcs_in_flight is reduced lower than max_mod_rpcs_in_flight.
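
The adjustment rule can be modelled as follows. This is a hypothetical Python sketch, not the mdc code: the class and method names are invented, the starting values 8 and 7 are assumed for illustration, and it assumes the invariant that max_mod_rpcs_in_flight stays strictly below max_rpcs_in_flight.

```python
class MdcTunables:
    """Toy model of the two tunables and how each setter adjusts the other."""

    def __init__(self, max_rpcs_in_flight=8, max_mod_rpcs_in_flight=7):
        # Starting values are assumptions made for this example.
        self.max_rpcs_in_flight = max_rpcs_in_flight
        self.max_mod_rpcs_in_flight = max_mod_rpcs_in_flight

    def set_max_mod_rpcs(self, value):
        # Raise max_rpcs_in_flight instead of rejecting the request.
        if value >= self.max_rpcs_in_flight:
            self.max_rpcs_in_flight = value + 1
        self.max_mod_rpcs_in_flight = value

    def set_max_rpcs(self, value):
        # Mirror case: lowering max_rpcs_in_flight pulls the other one down.
        if value <= self.max_mod_rpcs_in_flight:
            self.max_mod_rpcs_in_flight = value - 1
        self.max_rpcs_in_flight = value
```

Both setters succeed and silently adjust the other value, so the order in which the parameters arrive does not matter.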

If there are multiple MDTs, the "mdc.*.max_mod_rpcs_in_flight" param
may be set from e.g. the MDT0000 config log before MDT0001 is fully
configured, catching MDT0001 with ocd_maxmodrpcs = 0 before the OCD
from the MDT has been filled in, and incorrectly trigger an error.
If seen during setup, allow ocd_maxmodrpcs = (max_rpcs_in_flight - 1),
since this will be fixed up later if mdc.*.max_rpcs_in_flight is set
smaller in the config log (if set larger it doesn't matter).

Test-Parameters: env=ONLY=90 testlist=conf-sanity

This patch is back-ported from the following one:
Lustre-commit: 6d314902e6d19229379577aab60d4b20a5b4d2ea
Lustre-change: https://review.whamcloud.com/38455

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I4b20163e9e212db451738169ebdc361ab8c1c15e
Reviewed-on: https://review.whamcloud.com/38893
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12100 tests: Use least qunit to set limit 69/38769/4
Nathaniel Clark [Tue, 19 Nov 2019 14:52:45 +0000 (09:52 -0500)]
LU-12100 tests: Use least qunit to set limit

Use the least qunit to set the lower limit for inodes in sanity-quota/2.
This ensures that the limit is set at or above the minimum size.

Lustre-change: https://review.whamcloud.com/36797
Lustre-commit: 33e500cfb33406b8dddac46e1dfb5a3d59ff01c5

Test-Parameters: trivial
Test-Parameters: env=ONLY=2 testlist=sanity-quota
Test-Parameters: env=ONLY=2 testlist=sanity-quota fstype=zfs
Test-Parameters: env=ONLY=2,ONLY_REPEAT=20 fstype=zfs testlist=sanity-quota
Test-Parameters: mdtcount=2 mdscount=4 env=ONLY=2,ONLY_REPEAT=20 fstype=zfs testlist=sanity-quota

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I80e2c3cb66870d11f74f34c435e266a46630479b
Reviewed-on: https://review.whamcloud.com/36797
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/38769
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-13473 llite: don't check mirror info for page discard 56/38856/2
Bobi Jam [Wed, 22 Apr 2020 05:28:54 +0000 (13:28 +0800)]
LU-13473 llite: don't check mirror info for page discard

CIT_MISC is used for lock/page manipulation and does not go through
the full I/O procedure, i.e. cl_io_loop() will not be called
for it. So don't check the mirror info for a plain file, since it
is not initialized/set in this case.

Lustre-change: https://review.whamcloud.com/38307
Lustre-commit: d0dd744ed6ae002f34bdade993428b635b23d072

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I723d18260629b8f7c470d350d6d899d3bb88018a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38856
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12865 tests: fix sanity 160f to be more robust 33/38833/2
Andreas Dilger [Thu, 17 Oct 2019 07:19:26 +0000 (16:19 +0900)]
LU-12865 tests: fix sanity 160f to be more robust

The sanity test_160f test was failing intermittently because the first
Changelog user ("cl6") was being unregistered in some cases when it
set changelog_max_idle_time=10, but the test slept for 9s and then did
some operations that could be slow.  In rare cases the test runs too
long and the MDS evicts the "good" user along with the bad user:

   MDD0000: Force deregister of ChangeLog user cl7 idle more than 35s
   MDD0000: Force deregister of ChangeLog user cl6 idle more than 11s

Change the test sleep interval to be half of the max_idle limit so
that there is no risk of the "good" Changelog user being evicted.
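
The timing margin can be checked with a little arithmetic. In this Python sketch the helper name and the 2s operation delay are hypothetical; the delay is picked only so that the idle time matches the 11s shown in the log excerpt above.

```python
def good_user_evicted(max_idle, sleep_time, op_delay):
    """Return True if the user's idle time exceeds the idle limit."""
    return sleep_time + op_delay > max_idle


max_idle = 10                 # changelog_max_idle_time set by the test
old_sleep = 9                 # previous sleep interval
new_sleep = max_idle // 2     # half of the limit, as in this change
```

With a 9s sleep, any slow operation longer than 1s pushes the idle time past the limit, while a 5s sleep tolerates almost 5s of extra delay.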

Add some logging to the test so that it is easier to correlate test
script actions with events in the MDS debug log.

Lustre-change: https://review.whamcloud.com/36468
Lustre-commit: 4b0f0164c6ed761897409186376e9edc989323c9

Fixes: 31fef6845e8b ("LU-10680 mdd: create gc thread when no current transaction")
Test-Parameters: trivial envdefinitions=ONLY=160 testlist=sanity,sanity
Test-Parameters: envdefinitions=ONLY=160 mdscount=2 testlist=sanity,sanity

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0e4c9c271d98a2716f848e75676780b0383ebbe5
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38833
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13421 kernel: kernel update RHEL8.1 [4.18.0-147.8.1.el8_1] 27/38227/7
Jian Yu [Thu, 28 May 2020 07:44:34 +0000 (00:44 -0700)]
LU-13421 kernel: kernel update RHEL8.1 [4.18.0-147.8.1.el8_1]

Update RHEL8.1 kernel to 4.18.0-147.8.1.el8_1 for Lustre client.

Test-Parameters: trivial clientdistro=el8.1

Change-Id: I4c8d925f295927ed7b7fd8fd5d17754d720bfc4d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38227
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12761: tests: make version_code() accept two number versions too 15/38715/2
Oleg Drokin [Mon, 23 Sep 2019 12:39:48 +0000 (08:39 -0400)]
LU-12761: tests: make version_code() accept two number versions too

There's now a user in sanity test 103a that calls version_code with
2.6.  Andreas rightfully points out that instead of fixing the caller
we can just update the code to accept this usage.
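
The idea can be sketched in Python. This is an illustration, not the test framework's shell helper: it assumes the version is packed as major << 16 | minor << 8 | patch, and any missing component is treated as 0 so that a two-number version is accepted.

```python
import re


def version_code(version):
    """Pack a dotted version string into a single comparable integer."""
    # Split on punctuation so "2.12.3-wc1" yields "2", "12", "3", "wc1".
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", version) if p]
    nums = [int(p) if p.isdigit() else 0 for p in parts[:3]]
    # Pad with zeros so "2.6" is handled like "2.6.0".
    nums += [0] * (3 - len(nums))
    major, minor, patch = nums
    return (major << 16) | (minor << 8) | patch
```

Because the result is a plain integer, a skip check such as "server older than 2.12.3" reduces to a numeric comparison of two codes.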

Lustre-change: https://review.whamcloud.com/36275
Lustre-commit: 6521dda6f4377c9c688ce4905cd94adf9f99013f

Change-Id: I5915cd08a36946c6d26f2e231aa7a820a3eef46a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38715
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13553 lnd: gracefully handle unexpected events 52/38752/2
Amir Shehata [Wed, 20 May 2020 05:21:10 +0000 (22:21 -0700)]
LU-13553 lnd: gracefully handle unexpected events

When a tx completes, the kiblnd_tx_complete() callback is invoked.
We ensure:
LASSERT (tx->tx_sending > 0);
However, this assert is being triggered in some rare scenarios.
tx_sending could be 0 at this point because:
 1. ib_post_send() failed but the OFED stack is still sending
    a tx complete event.
 2. We're getting two different events for the same tx.

Instead of asserting, ignore that tx_complete event and print
the tx pointer and its status.
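
The change in behavior can be modelled in a few lines. This Python sketch is only an analogy for the C code: the Tx class, the tx_complete() function, and the logger name are hypothetical.

```python
import logging

log = logging.getLogger("o2iblnd-sketch")  # hypothetical logger name


class Tx:
    """Toy transmit descriptor with a counter of outstanding sends."""

    def __init__(self, sending=0):
        self.sending = sending


def tx_complete(tx, status):
    """Handle a completion event; ignore it if no send is outstanding."""
    if tx.sending <= 0:
        # Previously an assertion; now logged and ignored.
        log.warning("unexpected completion: tx=%r status=%d", tx, status)
        return False
    tx.sending -= 1
    return True
```

A duplicate or spurious event leaves the counter untouched instead of bringing down the node.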

Lustre-change: https://review.whamcloud.com/38669
Lustre-commit: 60f9f539e686fc19b080a3cda15ade7111bbd4a7

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8cd192538c0c80abaef23a4b6e6906936043060b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38752
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13225 utils: fix install path for bash-completion 70/38670/2
Andreas Dilger [Fri, 8 May 2020 23:28:39 +0000 (17:28 -0600)]
LU-13225 utils: fix install path for bash-completion

Fix the default install path for bash-completion if the package is
not installed at build time.  This avoids BASH_COMPLETION_DIR being
badly formatted in the lustre.spec file.

Lustre-change: https://review.whamcloud.com/38548

Fixes: dfb4afc24102 ("LU-13225 utils: bash completion for lfs and lctl")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie50071c4ff86f57bc9dd53409ae339da2a3ebbe5
Reviewed-on: https://review.whamcloud.com/38670
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13225 utils: bash completion for lfs and lctl 19/38519/5
Andreas Dilger [Sat, 8 Feb 2020 08:25:29 +0000 (01:25 -0700)]
LU-13225 utils: bash completion for lfs and lctl

Add a bash completion for "lfs" and improve completion for "lctl".
Rename the "lctl" completion script to "lustre" since the two
commands share helper routines for fsnames, pools, etc. and install
"lfs" and "lctl" symlinks to the common command file.

The completion prints available sub-commands and their options,
and for some sub-commands it completes available arguments
(e.g. mount points, pool names, and MDT/OST names).

A couple of minor changes to "lfs" and "lctl" usage messages to make
the sub-command options easier to parse.  More needs to be done to
make all sub-commands have proper long options.

There is definitely more that could be added to the completions,
but this is a good start and provides a framework for the future.

Lustre-change: https://review.whamcloud.com/37483
Lustre-commit: dfb4afc24102ee305d4901dc76944f4c91887633

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie989b2ef4c0d6d8565e5c6753205bb6ed83ebbe5
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/38519
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11986 libcfs: lnet_remove_debugfs() compat for RHEL6 16/38716/3
Jian Yu [Mon, 25 May 2020 18:27:22 +0000 (11:27 -0700)]
LU-11986 libcfs: lnet_remove_debugfs() compat for RHEL6

Unloading the libcfs module on a RHEL 6.10 Lustre client with
kernel 2.6.32-754.24.3 hits a kernel panic. The issue
doesn't exist in Lustre b2_10 where RHEL 6.10 is supported
and debugfs_remove_recursive() is called directly from
lnet_remove_debugfs(). This patch adds compat changes to
lnet_remove_debugfs() to resolve the issue.

Fixes: 9d42660e173e ("LU-11986 lnet: properly cleanup lnet debugfs files")
Fixes: ae93a9f21752 ("LU-11986 libcfs: add compat for d_hash_and_lookup()")
Test-Parameters: trivial
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Ib63a40afe8926f56cd1d2873975855c226098418
Reviewed-on: https://review.whamcloud.com/38716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10395 osd: stop OI at device shutdown 53/38153/3
Alex Zhuravlev [Tue, 18 Feb 2020 15:04:44 +0000 (18:04 +0300)]
LU-10395 osd: stop OI at device shutdown

Stop OI at device shutdown and not at obd_cleanup(). Otherwise a race
is possible: umount <MDT> stopping OI vs. MGS accessing the same OSD,
which results in the assertion:
ASSERTION( osd->od_oi_table != NULL && osd->od_oi_count >= 1 )

Lustre-change: https://review.whamcloud.com/37615
Lustre-commit: 2789978e1192dbf6d90399c96b5594e0dc049cd9

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I24fccea718f2e2663166cfb0ff26571039357535
Reviewed-on: https://review.whamcloud.com/38153
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11669 tests: add project in yml_test_group() 84/38684/2
Jian Yu [Wed, 20 May 2020 23:56:37 +0000 (16:56 -0700)]
LU-11669 tests: add project in yml_test_group()

This patch fixes yml_test_group() in yaml.sh to
add test project name, which is required in
results.yml for Maloo to parse.

This patch is back-ported from the following one:
Lustre-commit: 054bb02880d26bacc4e7080869955c2039bbf986
Lustre-change: https://review.whamcloud.com/33658

Test-Parameters: trivial

Change-Id: I0ae563d855dc2d28eaea85e86b1cb23d2428988b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Leonel Ochoa <lochoa@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38684
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-11269 ptlrpc: request's counter in import 40/38340/2
Alex Zhuravlev [Tue, 25 Feb 2020 16:44:18 +0000 (19:44 +0300)]
LU-11269 ptlrpc: request's counter in import

Add a request counter to the import that is separate from
imp_refcount, as the latter can be used for other purposes and is
hard to use for tracking requests.

This is to verify the theory that imp_refcount should be checked.

Lustre-change: https://review.whamcloud.com/37722
Lustre-commit: b09afdf57643cbc1c437a42b4babb0837dd19e65

Change-Id: I7c273a73e2b1bb43059c7ed003ee2b7d09273bfe
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38340
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11623 llite: hash just created files if lock allows 63/38763/3
Oleg Drokin [Tue, 6 Nov 2018 00:26:44 +0000 (19:26 -0500)]
LU-11623 llite: hash just created files if lock allows

If open|creat (and other intent operations later) returned a lookup
bit as part of the lock, hash the resultant dentry under this lock,
so that subsequent lookups do not trigger further RPCs.

Lustre-change: https://review.whamcloud.com/33584
Lustre-commit: fc42cbe0e2e5d1d87d0edca73986b831ac718301

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Id5140d1042af7f5ab9052922e11a7eda8f92a29a
Reviewed-on: https://review.whamcloud.com/38763
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-13357 lod: implement striped directory .dio_lookup 91/38691/6
Lai Siyao [Thu, 12 Mar 2020 00:35:20 +0000 (08:35 +0800)]
LU-13357 lod: implement striped directory .dio_lookup

Add function lod_striped_lookup() for
lod_striped_index_ops.dio_lookup to allow name lookup under striped
directory.

Currently this is used by subdir mount, which needs to lookup FID
of the subdir on server side.

Function lfsck_namespace_repair_dirent() should call dt_lookup() with
the bottom object, because the child may be a shard.

Add sanity 247f.

Lustre-change: https://review.whamcloud.com/37903
Lustre-commit: 42b0304e2571a80effe5bc4ab6fb58acfabb361d

Change-Id: Iba844d1a34a318bcbd42b00186ed6fa9d165effc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38691
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoNew release 2.12.5 2.12.5 v2_12_5
Oleg Drokin [Mon, 8 Jun 2020 13:18:56 +0000 (09:18 -0400)]
New release 2.12.5

Change-Id: I2bd2c42ba57730856fe454999f67b870b41330e8
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 years agoNew Rc 2.12.5-RC1 2.12.5-RC1 v2_12_5-RC1
Oleg Drokin [Wed, 27 May 2020 22:50:20 +0000 (18:50 -0400)]
New Rc 2.12.5-RC1

Change-Id: I2db2a133d8d8fc8479cc36e3714f4f62b2ea2dd5

3 years agoLU-13416 ldiskfs: don't corrupt data on journal replay 05/38705/4
Alexey Lyashkov [Mon, 20 Apr 2020 09:45:52 +0000 (12:45 +0300)]
LU-13416 ldiskfs: don't corrupt data on journal replay

Journalled writes need special attention when blocks are released:
revoke records must be added to avoid replacing newly written blocks
with stale data. Mark the inode as "journal write" to generate
valid revoke records. Large EA inode updates are also affected
by this bug.
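
The role of revoke records can be shown with a toy journal replay. This Python sketch is a simplified model and not ldiskfs or jbd2 code: the journal layout, the replay() function, and the rule that a revoke record suppresses copies logged in the same or an earlier transaction are assumptions made for illustration.

```python
def replay(disk, journal):
    """Replay logged block copies onto disk, honouring revoke records.

    journal is a list of transactions (index = transaction id); each is a
    dict with optional "write" ({block: data}) and "revoke" (blocks) keys.
    """
    # Pass 1: remember the latest transaction that revoked each block.
    revoked = {}
    for tid, tx in enumerate(journal):
        for blk in tx.get("revoke", ()):
            revoked[blk] = tid
    # Pass 2: apply logged copies unless a later-or-equal revoke exists.
    for tid, tx in enumerate(journal):
        for blk, data in tx.get("write", {}).items():
            if revoked.get(blk, -1) >= tid:
                continue
            disk[blk] = data
    return disk


# Block 7 was journalled, then freed and reused for new data on disk.
on_disk = {7: "new"}
without_revoke = [{"write": {7: "stale"}}, {}]
with_revoke = [{"write": {7: "stale"}}, {"revoke": [7]}]
```

Replaying without_revoke overwrites block 7 with the stale journalled copy, while with_revoke leaves the newly written data in place.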

The large EA fix is:

Linux-commit: ddfa17e4adc4bd19c32216aaa6250dc38b0579df
Author: Tahsin Erdogan <tahsin@google.com>
Date:   Wed Jun 21 21:36:51 2017 -0400
ext4: call journal revoke when freeing ea_inode blocks

Lustre-change: https://review.whamcloud.com/38281
Lustre-commit: a23aac2219047cb04ed1fa555f31fa39e5c499dc

Change-Id: I605128c4ba70331a48715dc95546430909efb893
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38705
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13589 utils: fix lfs setstripe unit parsing 80/38680/2
Andreas Dilger [Wed, 20 May 2020 18:19:32 +0000 (12:19 -0600)]
LU-13589 utils: fix lfs setstripe unit parsing

The "size_units" variable was not being reset while parsing different
"lfs setstripe" arguments, so e.g. "lfs setstripe -E 1M -S 65536 ..."
ended up using the 'M' unit for the stripe size, which resulted in a
stripe_size of 65536MiB = 64GiB, which resulted in an error.
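
The bug can be reproduced in a small model. This Python sketch is not the lfs source: parse_size() and the two argument loops are hypothetical, and it assumes that a value without a suffix is multiplied by whatever unit is currently stored in the shared variable.

```python
UNITS = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def parse_size(text, default_units):
    """Return (size_in_bytes, units); a suffix overrides default_units."""
    suffix = text[-1].lower()
    if suffix in UNITS:
        units = UNITS[suffix]
        return int(text[:-1]) * units, units
    return int(text) * default_units, default_units


def parse_args_buggy(args):
    size_units = 1
    sizes = {}
    for opt, value in args:
        # size_units leaks from one option to the next.
        sizes[opt], size_units = parse_size(value, size_units)
    return sizes


def parse_args_fixed(args):
    sizes = {}
    for opt, value in args:
        size_units = 1  # reset before every option
        sizes[opt], _ = parse_size(value, size_units)
    return sizes


args = [("-E", "1M"), ("-S", "65536")]
```

With these arguments the buggy loop yields a stripe size of 65536 MiB (64 GiB), while the fixed loop yields 65536 bytes.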

This only appeared with PFL or other layout patterns which had more
than one unit being parsed, and was already fixed in master via SEL.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib3f9be86f5104aaea4f77d87853255a518cbc3a0
Reviewed-on: https://review.whamcloud.com/38680
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13535 lfsck: fix possible PFL layout corruption 85/38585/5
Mikhail Pershin [Tue, 12 May 2020 20:32:22 +0000 (23:32 +0300)]
LU-13535 lfsck: fix possible PFL layout corruption

While checking lmm_oi in a composite layout, the 'lmm' pointer is
re-assigned to a component entry, but the same pointer is also used
as the LOV EA buffer when updating the EA. Therefore, if lmm_oi was
fixed in some component, only the current entry is saved as the new
layout.

Lustre-change: https://review.whamcloud.com/38584
Lustre-commit: be009cb4a73b3bef7302083bec7d1d6289d515b7

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ifbd984a71b383ab4ca35ad59ed9cd8cf57b6d7cc
Reviewed-on: https://review.whamcloud.com/38585
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
3 years agoLU-13111 kernel: new kernel [SLES12 SP5 4.12.14-122.20.1] 40/38640/3
Jian Yu [Sun, 17 May 2020 07:39:39 +0000 (00:39 -0700)]
LU-13111 kernel: new kernel [SLES12 SP5 4.12.14-122.20.1]

This patch makes changes to support new SLES12 SP5 release
for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 817" testlist=sanity

Change-Id: Ia4b856b03801e02da9a2e584efeb8759b4dd30c3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38640
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13556 kernel: kernel update RHEL7.8 [3.10.0-1127.8.2.el7] 35/38635/2
Jian Yu [Sat, 16 May 2020 23:56:34 +0000 (16:56 -0700)]
LU-13556 kernel: kernel update RHEL7.8 [3.10.0-1127.8.2.el7]

Update RHEL7.8 kernel to 3.10.0-1127.8.2.el7.

Test-Parameters: trivial clientdistro=el7.8 serverdistro=el7.8

Change-Id: If7ac6f4b5f1fe32a15c63f51589a2e320001b4a5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38635
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-13488 kernel: new kernel [RHEL 8.2 4.18.0-193.1.2.el8] 61/38461/5
Jian Yu [Fri, 22 May 2020 18:06:34 +0000 (11:06 -0700)]
LU-13488 kernel: new kernel [RHEL 8.2 4.18.0-193.1.2.el8]

This patch makes changes to support new RHEL 8.2 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.2 \
env=SANITY_EXCEPT="130" testlist=sanity

Change-Id: Icb1db3afd2e94423a45354acfdd559f8f1e294cb
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38461
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12904 o2ib: ib_destroy_cq() returns void 89/38489/3
Shaun Tancheff [Mon, 4 May 2020 22:03:38 +0000 (15:03 -0700)]
LU-12904 o2ib: ib_destroy_cq() returns void

Kernel destroy CQ flows can't fail, and the return value of
ib_destroy_cq() is of no interest in those flows.

kernel-commit: 890ac8d97e6722a9e4a66a0bd836d1b028d075fe

This patch is back-ported from the following one:
Lustre-commit: 7d2ea1e5bbd80f23e6935174c969b34b58048443
Lustre-change: https://review.whamcloud.com/36578

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I873bf76a33bd80d5e6de4d1b16a79ff5ea930f3a
Reviewed-on: https://review.whamcloud.com/38489
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12634 lnet: for_ifa removed. Use in_dev_for_each_ifa_rtnl 87/38487/3
Shaun Tancheff [Mon, 4 May 2020 21:59:47 +0000 (14:59 -0700)]
LU-12634 lnet: for_ifa removed. Use in_dev_for_each_ifa_rtnl

Linux 5.3 removed for_ifa and replaced it with _rtnl and _rcu
versions for use with their respective locking primitives.

kernel-commit: ef11db3310e272d3d8dbe8739e0770820dd20e52

This patch is back-ported from the following one:
Lustre-commit: 6e0d0146276353559c821916e193c90d167b14e0
Lustre-change: https://review.whamcloud.com/35744

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Iea07222b9abb3f9c219d28fe2c660d9eaf21af80
Reviewed-on: https://review.whamcloud.com/38487
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12400 ptlrpc: Sun RPC changes for RCU locking 83/38483/3
Shaun Tancheff [Mon, 4 May 2020 20:16:00 +0000 (13:16 -0700)]
LU-12400 ptlrpc: Sun RPC changes for RCU locking

In kernel 4.20 SUNRPC cache_detail->hash_lock changed to spinlock_t

  Now that the reader functions are all RCU protected, use a regular
  spinlock rather than a reader/writer lock.

Linux-commit: 1863d77f15da0addcd293a1719fa5d3ef8cde3ca

This patch is back-ported from the following one:
Lustre-commit: 77d53777e32c80047cb75293d5f9a4c0d23bbea8
Lustre-change: https://review.whamcloud.com/35499

Test-Parameters: trivial
Cray-bug-id: LUS-7600
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: If0df38337d5a2bb0ac4b8cb645dbe89f65e0f352
Reviewed-on: https://review.whamcloud.com/38483
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12355 llite: include file linux/selinux.h removed 80/38480/3
Shaun Tancheff [Mon, 4 May 2020 20:01:34 +0000 (13:01 -0700)]
LU-12355 llite: include file linux/selinux.h removed

In kernel 5.1 linux/selinux.h was removed with
SELinux: Remove unused selinux_is_enabled

Linux-commit: 3d252529480c68bfd6a6774652df7c8968b28e41

This patch is back-ported from the following one:
Lustre-commit: 39e5bfa73414d18738001761b42ea0e3264c2983
Lustre-change: https://review.whamcloud.com/35035

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: If963e6b22b7b07899de5b970f934bb157c5f7cec
Reviewed-on: https://review.whamcloud.com/38480
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12355 llite: Lustre specific iov_for_each broken (removed) 79/38479/3
Shaun Tancheff [Fri, 22 May 2020 17:59:40 +0000 (10:59 -0700)]
LU-12355 llite: Lustre specific iov_for_each broken (removed)

Kernel 4.20 introduced iov_iter_type and broke iov_for_each

As iov_for_each is only used once, drop the macro entirely.
When iov_iter_type is available ignore invalid iter types.

Linux-commit: 8a363970d1dc38c4ec4ad575c862f776f468d057

Kernel 3.15 added type to iov_iter. Use the type to provide
a sensible replacement for iov_iter_type when it is available.

Linux-commit: 71d8e532b1549a478e6a6a8a44f309d050294d00

This patch is back-ported from the following one:
Lustre-commit: d93aa0171a25f8ffca51bed35a2d477a45fda0f3
Lustre-change: https://review.whamcloud.com/35024

Cray-bug-id: LUS-6962
Change-Id: I97cdce1c85803ac2d4436d4eedf67a545ea2cdb8
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/38479
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12400 llite: Use the new vm_fault_t type 78/38478/3
Shaun Tancheff [Fri, 22 May 2020 17:49:45 +0000 (10:49 -0700)]
LU-12400 llite: Use the new vm_fault_t type

Linux 4.17 created the new vm_fault_t type

Linux-commit: 1c8f422059ae5da07db7406ab916203f9417e396

Linux 5.1 changed the vm_fault_t type to bitwise unsigned int
which changes the interfaces registered to struct vm_operations_struct

Linux-commit: 3d3539018d2cbd12e5af4a132636ee7fd8d43ef0

Prefer to match the upstream API and fallback to 'int'
where vm_fault_t is not available.

This patch is back-ported from the following one:
Lustre-commit: f2b224a48cb00f885b9df2cc56e349dae5f27f9e
Lustre-change: https://review.whamcloud.com/35500

Test-Parameters: trivial
Cray-bug-id: LUS-7600
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I7122fb0d4af3ee9a19c1a5d0b77c4f13f6850181
Reviewed-on: https://review.whamcloud.com/38478
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>